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Abstract. This report explores methods for determining the pose of a grasped 
object using only limited sensor information. The problem of pose determination is 
to find the position of an object relative to the hand. The information is useful when 
grasped objects are being manipulated. The problem is hard because of the large space 
of grasp configurations and the large amount of uncertainty inherent in dexterous hand 
control. By studying limited sensing approaches, the problem’s inherent constraints can 
be better understood. This understanding helps to show how additional sensor data can 
be used to make recognition methods more effective and robust. 

This report introduces three approaches for pose determination. The first is based 
on an interpretation tree representation of possible object feature placements on finger 
segments. The tree is built in real-time based on the hand’s configuration and an object 
model. The method is highly efficient as it only explores consistent paths through the 
tree. 

The second approach is based on storing past recognition experiences in a memory. 
Recognition becomes a fast lookup operation. The number of possible grasps stored in 
the memory can be kept small by exploiting a grasp acquisition strategy constraint. The 
memory can be further compacted by using interpolation schemes. Various approaches 
for organizing, filling, and using the recognition memory are investigated. 

The third approach explores how additional contact sensing information can be used 
for recognition. Fingertip force sensors are used to measure contact surface normals. 
An object’s pose can be refined by fitting these normals to a model of the object. Since 
contact sensors produce local readings, and since the fitting process requires global 
information, calibration becomes an important issue. The method explored provides a 
way to self calibrate for sensor orientation errors. 

An important theme throughout the report is that knowledge of the strategy the 
hand uses to acquire an object provides a very important recognition constraint. While 
a dexterous hand has a very large configuration space, the number of unique grasps that 
correspond to a given strategy are limited. Knowledge of the grasp acquisition strategy 
makes pose determination easier. 
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Introduction 


Chapter 1 


We are certainly very good at performing manipulation with our hands. Along with a 
set of tools, we can make our hands perform an enormous variety of tasks. What makes 
them so versatile? A combination of factors are probably at play. Our fingers can move 
quickly and accurately. They can perform both delicate operations or exert powerful 
grasping forces. Our touch sensors provide a wealth of information that undoubtedly 
helps make them so useful. 

Dexterous robot hands will ultimately give machines some of the capabilities that 
our own hands have, helping to make them more useful. A hand is most appropriate for 
performing manipulations in unstructured environments, particularly where it is unde¬ 
sirable for humans to work. An example of such a task is the cleanup of hazardous sites. 
The work required to contain the tragic accident at Chernobyl needlessly exposed thou¬ 
sands of people to deadly radiation. Potentially, mobile robots equipped with dexterous 
hands could be used in similar situations. 

This report addresses a small part of the overall problem of giving a robotic hand 
human level dexterity, the problem of determining the pose of a grasped object. 
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Figure 1.1: Pose determination. The relationship between recognition, determination, and 
refinement. 


1.1 Pose Determination 

The class of problems studied in this report can be stated simply: Given a hand grasping 
an object, and given models of a small number of objects, determine the object, its 
position, and its orientation (see Figure 1.1). These problems are often referred to as 
small set recognition and pose determination. Recognition and determination are related 
problems, and it is sensible to study them together. For convenience, in this report the 
term pose determination often refers to these problems collectively. 

1.1.1 Relevance of the Studied Problem 

Why is the pose determination problem interesting? A hand is used to grasp an object 
in order to manipulate the object. The manipulation can be simple, such as moving 
the object from one location to another, or it can be complex, such as grasping a tool 
for performing an assembly operation. In either case, knowledge of the location of the 
object within the hand is useful for insuring that the manipulation is performed correctly. 
While certain manipulation strategies might not require pose information, it is not hard 
to imagine that the information would be beneficial. 

The problem is hard because of the large space of grasp configurations and the large 
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Figure 1.2: Object and its grasp. The object, shown on the left, is grasped by a hand, shown 
on the right. For pose determination, only the hand shape is known. The one or more positions 
of the object that are consistent with that hand shape are desired. 

amount of uncertainly inherent in dexterous hand control. By studying limited sensing 
approaches, the problem’s inherent constraints can be better understood. This under¬ 
standing helps to show how additional sensor data can be used to make determination 
methods more effective and robust. 

1.1.2 Example of the Studied Problem 

This section describes a typical pose determination problem. The inputs to the 
problem include a model of the robotic hand, a model of the object, and the hand’s joint 
angle positions after the object has been grasped. The object is shown in Figure 1.2, on 
the left. A grasp of the object is shown on the right. For pose determination, the hand 
shape is known, while the object position is unknown, and is to be found. Figure 1.3 
shows the hand shape, along with the poses that are consistent with the shape. The only 
sensor information used for this determination is the set of joint angle readings from the 
hand. No additional sensor data is required to obtain the set of solutions that are shown 
in the figure. In this case, since the solution set contains the object’s actual pose along 
with several other consistent candidates, additional sensor data would be required for 
unambiguous determination. The first row of candidate poses, for example, could be 
ruled out if information about the joint torque readings were available. If the joints are 
known to be curled as far forward as possible, the links must make the necessary contacts 


Chapter 1 Introduction 



Figure 1.3: Hand shape and potential poses. The object poses shown here are consistent with 
the hand’s shape. 

with the object to constrain the finger motion. The second row of poses shown in the 
figure are entirely consistent with both the geometric constraints and the joint torque 
data. Additional sensor information would be needed to distinguish between them. 


1.1.3 Potential Information Sources 

The information available for pose determination can be grouped into that from the con¬ 
straints inherent in the problem and that from sensors. An understanding of a problem’s 
constraints often leads to powerful techniques for its solution. An understanding of how 
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the constraints can be used to solve a problem helps to interpret sensor data. With this 
in mind, this report explores the pose determination problem by relying heavily on the 
problem’s constraints, and using little external sensor data. This section explores the 
potential information sources available, and explains why the ones that are used in this 
report were selected. 

Geometric Constraints 

Geometric constraints provide a powerful inherent source of information for determina¬ 
tion. The methods described in this report exploit both the geometry of the objects and 
the shape of the hand. 

Another inherent constraint is provided by the grasping strategy. A particular grasp¬ 
ing strategy limits the possible hand shapes that can occur. As will be seen, this par¬ 
ticular constraint can be directly exploited for determination. 

Contact Sensors 

A variety of sensors can be used to provide the raw data for pose determination. Perhaps 
the most basic is kinesthetic (joint position) sensing. This type of information is readily 
available on dexterous hands, as joint position sensing is usually required for their low 
level servo control. Cutaneous (tactile) sensing is more advanced, from a hardware 
standpoint. The term haptics is often used to refer to these hand-based senses. Visual 
sensing can also be used, but as will be seen, it is less desirable than haptics for many 
applications. 

Tactile sensors give a small window of information at their contact point, which 
provides data suitable for local methods. Curvature detection is an example of a local 
problem that can use tactile sensor data. Kinesthetic sensing can be used to obtain the 
hand’s shape, which provides data suitable for global methods. Pose determination is 
an example of a global problem that can use hand shape data. 



6 


Chapter 1 Introduction 


Vision Sensors 


Unlike tactile sensors, vision can give a large window of information. With a camera in 
the proper location, one view can provide far more information than could be obtained 
even by multiple probes with tactile sensor. Thus, vision potentially provides an easy 
way to acquire global features. 

There are a number of reasons why vision is not a particularly good source of infor¬ 
mation for pose determination. The most significant problem is the potential occlusion 
of the grasped object by the hand. For relatively small objects the hand can obscure 
many, perhaps all, of the object’s visible features. In addition, discrimination between 
features of the hand and features of the object becomes difficult as more of the object 
is occluded. 

Typically, the pose of an object with respect to the position of the hand is required. 
Cutaneous and kinesthetic sensing take readings in the hand frame, which is desirable. 
Readings from a vision system would usually be made from a different frame, requiring 
additional calibration. 

The resolution of vision systems may not be adequate for tasks that require extremely 
accurate pose determination. The advantage that vision offers, that of being a global 
sense, works against the requirement for precision. Since haptic information sources take 
readings at the site of contact between the hand and the object, they can potentially 
give better readings than a remote vision system could. 

Vision is a useful sense for determining an object’s position prior to being grasped. 
Unfortunately, when the object is touched, it usually moves in a hard to predict way. 
Because of this, the pre-grasp position does not always give a good indication of the 
object’s final pose. 
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Information Used by the Studied Approaches 

The pose determination algorithms presented in this report use only the inherent prob¬ 
lem’s constraints, along with the addition information provided by joint angle and joint 
torque sensors. As explained above, the motivation for this limited information approach 
is twofold. A full exploitation of the basic constraints inherent in the problem leads to a 
better understanding of the problem. Methods that can work on kinesthetic information 
will facilitate this understanding. From a practical standpoint, kinesthetic information 
is the most readily available data source today. This makes it even more desirable to 
develop methods that work with this type of data. 

It is important to note that while the techniques explored in this report rely only on 
minimal sensor data, they can easily incorporate, and will benefit from, the additional 
information that tactile sensors provide. The constraints that are exploited are useful 
independent of the external sensor data available. For example, more data can be used 
to reduce ambiguity and to provide redundancy that would make the methods more 
robust. 


1.1.4 Approaches 

This report introduces three approaches for pose determination, as diagrammed in Fig¬ 
ure 1.4. The first approach is constraint-based, and uses an interpretation tree repre¬ 
sentation of possible object feature placements on finger segments. The tree is built in 
real-time based on the hand’s configuration and an object model. The method is highly 
efficient as it only explores consistent paths through the tree. 

The second approach is memory-based, and uses past experiences for determination. 
Determination becomes a fast lookup operation. Possible grasps can be kept sparse 
by exploiting a grasp acquisition strategy constraint. The memory can be compacted 
using interpolation schemes. Various approaches for organizing, filling, and using the 
determination memory are investigated. 
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Figure 1.4: Algorithms and their relationship. The algorithms studied in this report , and 
their relationship, are diagrammed in this figure. 

The third approach is sensor-based, and explores how additional information can be 
used for determination. Fingertip force sensors are used to find contact surface normals. 
An object’s pose can be refined by fitting these normals to a model of the object. Since 
contact sensors produce local readings, and since the fitting process requires global 
information, calibration becomes an important issue. The method explored provides a 
way to self calibrate for sensor orientation errors. 

As will be seen, the approaches studied are straightforward yet powerful. This work 
shows that simple methods, using basic constraints and limited sensor data can solve a 
wide class of pose determination problems. Showing this to be true is the motivation 
for this work. 
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1.2 Grasp Planning and Object Acquisition 

In this report , a distinction is made between objects that have been grasped using a 
plan, and objects that have been acquired without a plan. Typically, a grasp planner 
is given as input the position of the object to be grasped. The plan generated has a 
certain tolerance to error in object placement, though undoubtedly the grasp will fail 
if the object’s actual position is far from the modeled position. The error tolerance 
of a particular grasp is often a factor that is considered in the planning process. A 
grasp acquirer works without knowledge of an object’s position. Instead, a generic hand 
motion, perhaps modified by sensor feedback, is used to acquire the object. 

There are many examples of problems that are suitable for planned grasps. In par¬ 
ticular, when the world is well known, or when external sensors like vision can identify 
the location of objects, the use of a planner is appropriate. Since planning can be a slow 
process, it is helpful if the world is relatively static. Certainly, if it is changing faster 
than a planner can plan, there will be problems. As uncertainty in the world increases, 
planning becomes more difficult. Planners can be designed to handle uncertainty up to 
a point (see Mason [70]). In the limiting case, where nothing specific is known about 
the world, there is little reason to plan. 

Grasp acquisition is useful for problems where planning is inappropriate. This include 
when using a planner may be difficult, because there is too much uncertainty, or when 
it may be impossible, because there is no information at all. Haptic exploration is an 
important class of motions where planning is not appropriate. Reaching into a bin, 
identifying a particular item, and grasping it, is an example of such a motion. A more 
practical example includes tool retrieval. NASA is interested in a device to capture 
free-floating tools and other small objects from space. An astronaut might accidentally 
release a tool during a space walk, creating a flying obstacle that could collide with a 
spacecraft. It is unlikely that planning a grasp to retrieve the floating object would be 
appropriate. Using an acquisition strategy designed to capture the class of objects that 
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can potentially be released seems more reasonable. 

The pose determination strategies studied in this report are useful for finding the 
position of an object that as been acquired, where no a priori knowledge of the object’s 
position is available. The methods are also suitable for distinguishing among a small set 
of objects that are being acquired. An easy way to do this is to simply invoke the pose 
determination strategy once for each object in the set. Ideally, the strategy will find a 
pose only for the correct object. Finding multiple objects implies that the information 
being used for pose determination is just not enough to distinguish among the objects. 

Pose determination is also useful for verifying that a planned grasp has been executed 
correctly. Because of errors, a planned grasp is never executed exactly as intended. 
Verification of the plan to insure that it has completed satisfactorily is a useful step to 
perform before proceeding to the next manipulation. Ideally, the planned grasp would be 
executed in a closed loop, where errors are detected in real-time, and recovery strategies 
are part of the plan. Post-execution verification is useful, though not necessarily the 
best approach. 


1.3 Hand Design and Shape Information 

The design of a hand directly affects its haptic information content. For example, 
hands with more links can better conform to the shape of an object, providing more 
clues as to the object’s shape. Double jointed hands provide more shape information 
than a single jointed hand, as shown in Figure 1.5. Some of these issues will be examined 
in subsequent chapters of this report . 

1.4 Assumptions 

The work in this report makes certain assumptions about the world. The methods that 
are described are not, in all cases, limited by these assumptions. Rather, the methods 
have been tailored to best fit the natural constraints that these assumptions provide. 
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Figure 1.5: Double joint grasps. This figure compares grasps using a double joint hand with 
a single jointed hand. The double jointed grasps are shown on the right. 

As each method is presented, its limitations and its best applications are discussed. 

The objects in the world are assumed to be polyhedral. This permits simpler models 
and algorithms to be used compared to objects represented with a more general scheme. 
In many cases, the ideas presented can be extended to a fully general representation. In 
some cases they cannot. Each chapter discusses this issue in some detail. 

By assuming that objects are resting on a table-top before they are grasped, it is 


12 


Chapter 1 Introduction 


possible to use two dimensional methods to solve certain three dimensional problems. 
This table-top constraint reduces the possible configuration space of the object from 
three dimensions to n spaces in two dimensions, where n is the number of object faces. 
This assumes that an object can only rest on one of its stable faces. For an uncluttered 
workspace, this is probably a reasonable assumption. The pose determination methods 
described in this report are implemented in two dimensions, and can be directly used 
for three dimensional problems using this added constraint. It is important to note that 
full three dimensional extensions to the methods are possible, and are explored in some 
detail. 

One additional assumption made is that the world contains invariants that can be 
exploited by determination schemes. Object models are, of course, invariant. Their 
shape, compared with a hand’s shape, provides important clues that can be used for 
pose determination. Chapter 5 exploits the invariant of gravity. Gravity provides a 
force that can be used to relate different coordinate systems together, and is used to 
refine an initial estimate of an object’s pose to a more precise estimate. 


1.5 Contributions 

The most important contributions of this work can be summarized as follows: 

• It is shown that a hand’s shape often provides enough information to uniquely 
determine the pose of a grasped object. 

• It is shown that knowledge of the grasp acquisition strategy provides a useful 
recognition constraint. 

• An efficient algorithm for finding object poses based on tree pruning is presented. 

• A fast algorithm for finding object poses based on experiences stored in a memory 
is presented. 
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• Minimal sensing that determines the links that are in contact with an object is 
shown to be very effective for reducing ambiguity in the results. 

• Experimental results explore the tradeoff in recognition power obtained by adding 
links or sensors to a hand. 

• A method for refining contact surface normals from fingertip sensors is presented. 
These contact normals can be used for pose determination. 

• Experiments indicate that it is hard to extract global information from the local 
measurements obtained from fingertip sensors. 

The methods that are explored in this report are not complex. In fact, their simplicity 
can be considered an important contribution in itself. This report will show that pose 
determination can be accomplished with simple methods and minimal sensing. While 
this does not imply that more complex methods are unnecessary, it does indicate that the 
constraints that are used lend themselves to straightforward determination algorithms. 

1.6 Overview of the Report 

Before describing the pose determination and refinement algorithms, Chapter 2 overviews 
hands, sensing, grasping, and recognition. The chapter provides a general review of re¬ 
search related to pose determination. Each subsequent chapter mentions work more 
directly related to the particular algorithms being presented. Following the review, the 
next three chapters are organized around the diagram shown in Figure 1.4. Chapter 3 de¬ 
scribes a constraint-based determination method. Chapter 4 describes a memory-based 
determination method. The constraint-based method can be used as the grasp simulator 
that is required for populating the memory used by the approach in this chapter. These 
two chapters present methods for finding a pose estimate. Chapter 5 describes a tech¬ 
nique for refining pose estimates by using additional sensor data. Finally, conclusions 
and directions for future research are discussed in Chapter 6. 









Hands, Sensors, Grasping and, 
Recognition 


Chapter 2 


2.1 Introduction 

This chapter reviews relevant research in hands, sensing, grasping and recognition. The 
topics reviewed cover such a broad area because of their unavoidable interrelationships. 
As this report attempts to show, it is improper to consider one of these aspects without 
considering them all. To build a hand for recognition — mechanical design, sensor design, 
and grasping strategies must all be considered. In each of the next sections, relevant 
work from these areas is described, with special attention given to the interrelations 
between them. Specific discussion of how these methods directly relate to the work 
described in this report is deferred to the related work sections in each of the subsequent 
chapters. 


2.2 Dexterous Hands 

Dexterous robotic hands have been studied at least since the early 1960’s. Some of the 
earliest work is by Tomovic [104] and Okada [79]. Other hands have been developed 
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Figure 2.1: The Salisbury-JPL hand, from Salisbury [90]. 


by Salisbury [90], Jacobsen et al. [57], Bologni et al. [14], Abramowitz et al. [1], and 
Caporali and Shahinpoor [17]. This section discusses a number of these hands in more 
detail. The section is loosely organized around the motivating design principles used by 
the researchers. 
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2.2.1 Mobility Design 

A three-fingered hand was designed by Salisbury [90]. In his work, a mobility analysis of 
various kinematic configurations was performed. The actual design he selected optimized 
certain criterion according to this consideration. Thus, Salisbury’s primary goal was to 
achieve a mechanical design that was well suited for grasping. The hand is actuated by a 
servo-motor pack connected to the joints using bicycle-style cables. Figure 2.1 diagrams 
the configuration that Salisbury selected. 

2.2.2 Anthropomorphic Design 

Jacobsen et al. [57] developed the Utah-MIT hand based on a belief that an anthro¬ 
pomorphic design has certain inherently desirable traits. The versatility of the human 
hand is proof that its design is a good one. Using this principle, the four fingered, four 
jointed hand shown in Figure 2.2 was developed. Special attention was given to the 
tendon and actuator design. A Kevlar material was used for the tendons, giving them 
both strength and flexibility. Jacobsen believes that the actuator component is at least 
as crucial as the mechanics. In the case of the Utah-MIT hand, the pneumatic actua¬ 
tors provide a very human-like natural compliance. Their performance is well suited for 
grasp acquisition routines, as will be discussed later in this report . 

2.2.3 Behavioral Design 

While both the Salisbury-JPL and the Utah-MIT hands were designed to have a rea¬ 
sonable kinematic configuration for grasping, their performance was more optimized for 
mobility, or said another way, dexterity. As an alternative, Greiner’s [40] Prehensile Ac¬ 
quisition Linkage Mechanism (PALM) is specifically designed for grasping (Figure 2.3). 
Mobility is not considered as important. The device has one active and two passive 
degrees of freedom. The actively controlled tendon can be used to close the hand. The 
same mechanism will passively curl the hand around objects that are pushed into its 
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Figure 2.2: The Utah-MIT hand, from Jacobsen et al. [57]. 

links. In essence, the hand implements a grasping strategy in its hardware. As will be 
seen, the methods described in this report are very suitable for giving a device like the 
PALM recognition capabilities. The fixed grasping strategy that this hand uses could 
be exploited by the recognition algorithms. 


2.2.4 Other Design Considerations 

There are other reasons given for particular hand designs. Hirose and Umetani [49] de¬ 
veloped a soft, snake-like grasper. The University of Bologna hand (Bologni et al. [14]) 
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Figure 2.3: The Greiner PALM, from Greiner [fO]. 

was designed to be “at least as useful” as a conventional robotic end-effector, while also 
having “micro-manipulation” capabilities. Abramowitz et al. [1] cited the recent devel¬ 
opment of tactile sensors as a primary reason for building the Pennsylvania Articulated 
Mechanical Hand. Sensor equipped hands, he reasoned, are good for three dimensional 
perception. Rather, it seems likely that hands will be most useful as manipulation de¬ 
vices, where the recognition that they perform is directly related to their manipulation 
needs. Sensor designs have been motivated by hand designs, not the other way around. 
Nonetheless, Abramowitz made the good point that hands have an important role in 
sensing, or haptics. This issue will be explored more later in this section. 

Another aspect of hand design that has been considered are the properties of the 
fingertips themselves. Brockett [16] argued that rheological surfaces are good for grasp¬ 
ing. Cutkosky et al. [26] analyzed a number of materials to find their suitability for 
fingertip surface coverings. The design of a covering becomes more complex when the 
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fingers are equipped with tactile sensors. Their surface must protect the sensors while 
not interfering with the transduction process. 

From a mechanical standpoint, robotic hands are advanced and are highly dexterous. 
However, their actuation system are much too bulky. The Utah-MIT hand uses a pneu¬ 
matic actuation system with flexible Kevlar tendons. While the actuators themselves 
are fairly compact, a large external air source is required for power. The entire actuator 
pack is also too bulky for mounting on most robots. Chiarelli and De Rossi [20, 21] and 
others are studying muscle-like fibers that may be the basis for a far more advanced and 
compact actuation system of the future. 

2.3 Touch Sensors 

Touch sensors are thought to be important for many aspects of dexterous hand manip¬ 
ulation. Certainly, an accurate measurement of contact forces is helpful when grasping 
delicate objects. Feature detection can provide useful information for object recognition 
and pose determination. Slip detection is helpful for monitoring a grasp to insure that 
it is being properly maintained. This section reviews tactile sensing, briefly exploring 
human sensors, the mechanics of sensing, and a variety of robotic devices. 

2.3.1 Human Sensors 

A goal of tactile sensor designers has always been to duplicate the capabilities of the 
human sensing system. Human touch sensors have many very desirable properties. Per¬ 
haps the most important one of all is that they are compact and reliable. There are 
thousands of mechanoreceptors in the small confines of the fingers. Humans have four 
types of touch sensors, each which is specialized for a particular response (Figure 2.4). 
The Merkel and Ruffini receptors have some static touch response, while the Meissner 
and Pacinian respond better to a changing stimulation. Human touch sensors can de¬ 
tected pressure, shear, and slip (Johansson et al. [59, 60, 61]). Measurements of the 
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Figure 2.4: Cross section of human skin. 

performance of the human tactile system made by Vallbo and Johansson [106] indicate 
that static two point discrimination between 1.5 and 2.2 mm is possible. The sensations 
that we are so capable of feeling have not yet been entirely duplicated by robotic devices, 
although recent advances are starting to close the performance gap. 

2.3.2 Mechanics of Transduction 

All tactile sensors have the common task of detecting some mechanical phenomenon and 
transducing it to a measurable signal. Ultimately, the signal is converted to digital data 
for processing by a computer. The earliest tactile sensors used various phenomenon to 
generate an analog electrical signal. Later devices translated mechanical phenomenon 
to an optical signal. More recent devices directly convert the sensed phenomenon to 
a digital signal, bypassing the analog signal stage. See Usher [105] for an overview of 
sensing, including the physical effects that are available for use in transduction. 
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The most common property that sensor designers have attempted to detect is pres¬ 
sure. An array of pressure sensors are frequently used to find the contact profiles of 
objects pushed into them. This would be useful for determining, for example, if an 
object vertex, edge, or face is making contact with the sensor. Pressure sensors of this 
type have proven to be the easiest to build. 

Other phenomenon that are interesting to detect include shear and torque at the 
contact point. Measurement of these properties is helpful for determining if slippage 
is likely to occur. Unlike pressure, shear and torque sensors have been hard to build. 
There are only a few examples of sensors of this type in the literature. 

Another interesting sensor has been used to detect the thermal properties of a mate¬ 
rial. By applying heat and measuring the resulting temperature gradients, the contact 
material can often be identified. In addition, slip can be detected by measuring temper¬ 
ature changes. As warmer material slides past the sensor, the cooler material that has 
not yet been heated can be detected. 

2.3.3 Robotic Touch Sensors 

A review of tactile devices starts with the early work of Inoue and Binford. In 1972, 
Inoue [54, 55] used an array of switches made using a foam rubber separator and con¬ 
ductive paper. Hill [45] also performed early research with a sensor that used a simple 
array of switches. Binford [12] pursued several approaches, including semiconductor 
strain gauges and various pressure sensitive paints and rubber polymers. Since then, a 
wide variety of technologies have been employed to reduce sensor size, to make them 
conform to curved mounting surfaces, and to improve their reliability and sensitivity. 
The technologies that have been tested include: 

1. Resistive: pressure is detected from a change in a material’s resistance (Pur- 
brick [84], Hillis [47], Grotenhuis and Moore [43], Snyder and St. Clair [98], and 
Bastuscheck [7]). 
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2. Capacitive: pressure is detected from a change in the thickness of a capacitor’s 
dielectric (Boie [13], Siegel et al. [94], Fearing [33], and Jacobsen et al. [56]). 

3. Magnetic: compression, rotation, and shear are detected from the change in orien¬ 
tation of magnetic dipoles (Hackwood et al. [44], Checinski and Agrawal [19], and 
Kinoshita [65]). 

4. Optical: compression is detected from a change in intensity of light (Schneiter and 
Sheridan [92], and Begej [8]). 

5. Semiconductor: pressure is transduced using resistive, capacitive, or optical sensor 
elements integrated onto a chip (Raibert [85], Raibert and Tanner [86], Chun and 
Wise [22], and Tise [102]). 

6. Polyvinyledene Fluoride: pressure is detected from an electric response generated 
when a Polyvinyledene Fluoride material is disturbed (Kinoshita et al. [66], and 
Dario et al. [28]). 

7. Ultrasonic: pressure is detected by using an ultrasonic pulse to measure the change 
in thickness of a material (Grahn and Astle [39]). 

8. Thermal: material is identified based on its thermal conduction properties (Siegel 
and Simmons [96], and Russell [88]). 

Other interesting recent devices include a slip detector by Howe and Cutkosky [51], and 
a shear sensor by Novak [78]. As can be seen, the list of tactile sensing technologies is 
large and varied. Nonetheless, major problems which prevent the use of tactile sensors 
for all but a few specialized applications still remain. 

Today, robotic hands such as those designed by Salisbury [90] and Jacobsen [57] 
are increasingly being equipped with contact sensors. Jacobsen [56] described a sensing 
system suitable for covering most surfaces of the Utah-MIT hand with binary contact 
detectors. Dario et al. [27] has mounted his sensor on a robotic finger. Fearing [34,35] has 
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mounted a capacitive-based fingertip sensor on the Salisbury hand. Brock and Chiu [15] 
also developed a fingertip sensor that has been mounted on the Salisbury hand. Their 
sensor is the one that is used for the experiments conducted in Chapter 5. 

Based on the experiences of tactile sensor developers, it is reasonable to conclude that 
human-like sensor performance is not yet achievable. The cost of building the sensor 
systems is also unknown, and is likely to be large. 

2.4 Grasp Planning and Analysis 

Grasping and pose determination have an important relationship to each other. Grasping 
is the acquisition of an object while determination identifies its final resting position. 
Grasping without determination is incomplete, as the lack of knowledge in how the 
object is positioned in the hand may preclude subsequent useful manipulations. 

Typically, a grasp planner is given as inputs a model of an object, its position, a 
kinematic model of the robot, and perhaps some task level information. Planners make 
a number of assumptions to make their task tractable, including simplified models of 
the fingertip contacts, frictional effects, and the object itself. Only then does planning 
a grasp become tractable. 

Even with simplifications, the planner’s task is a hard one. To begin, it must find a set 
of finger contacts that stably acquire the object. If the object is to be manipulated rather 
than just constrained, the planner must take this into account. This can be thought of as 
a task-level constraint. The question of reachability must also be considered. The robot 
must have a clear path from its starting position to the final grasp position. Collisions 
between the hand and the grasped object, and the hand and workspace obstacles must 
be considered. To date, no grasp planners have been able to address all these issues at 
once. 

To make the problem tractable, the grasping task is usually divided into easier sub¬ 
problems. Typically, the problem is decomposed into an analysis of stability, feasibility, 
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and reachability, each which is solved separately. Stability considers whether a partic¬ 
ular set of fingertip to face assignments grasp an object stably. Feasibility considers 
whether the hand’s kinematics can achieve the finger positions that the set of contact 
points requires. Reachability considers whether the robot’s arm can position the hand’s 
wrist at the required location. Solving each of these steps separately greatly simplifies 
matters, though it often leads to generate and test style algorithms. This approach can 
be wasteful if the workspace is very cluttered, where most stable and feasible grasps will 
not be reachable. 


2J t .l Stability Analysis 

The problem of analyzing and optimizing a given fingertip grasp has been extensively 
studied. Many criteria have been proposed for a good grasp. Good grasps should be 
stable. They should resist outside perturbation. A force closure grasp, one where the 
object is totally constrained by the contacts independent of the magnitude of the contact 
forces, is often considered ideal (Nguyen [77]). 

Many researchers have analyzed grasp stability. Kerr and Roth [63] examined the 
problem of selecting the internal grasping forces given three contact points. For a three 
fingered hand grasping an object with just its fingertips, only six of the nine finger¬ 
tip force unknowns are constrained by the basic Newton-Euler force and torque balance 
equations. The remaining three force components form the null space of internal grasping 
forces. Though these forces can be assigned arbitrarily, they suggested various optimiza¬ 
tion techniques for choosing them, based on a set of constraints. Barber et al. [5] used 
a quality measure based on the amount of friction necessary to keep the grasp from 
slipping. Jameson and Leifer [58] predicted the stability of fingertip grasping with both 
point contact and soft finger contact models. Li and Sastry [68] proposed a task oriented 
quality measure for evaluating a grasp. Park and Starr [80] proposed two indices for 
measuring the quality of a grasp. The uncertainty grasp index indicates how stability is 
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effected by position uncertainty. The task compatibility index represents how well suited 
the grasp is for the intended task. 

Static stability of a grasp is, of course, not the only issue that must be considered. A 
good grasp is one that firmly grips the object, and that can resists disturbing forces. This 
type of grasp, however, might preclude subsequent object motions that are necessary for 
completing the manipulation. This observation leads to grasp classification schemes such 
as that proposed by Iberall and Lyons [53]. For example, grasps can be coarsely grouped 
into power grasps and manipulatory grasps. A person’s normal grip of a hammer can be 
considered a power grip, while the grip used for holding a pencil is a manipulatory one. 

2-4-2 Grasp Synthesis 

Unlike much previous work, Nguyen [76] developed an analytical technique for synthesiz¬ 
ing force closure grasps, rather than just analyzing given grasps. Essentially, he found 
regions of contacts for the fingertips that constrain the object according to the force 
closure criterion. 

Jones and Lozano-Perez [62] studied the problem of grasp selection as a collision 
avoidance problem. Hence, their work concentrated on the question of reachability. An 
efficient representation of configuration-space was used. Likewise, Pertin-Troccaz [82] 
studied this problem using a configuration-space approach. Both groups only examined 
the case of two fingered grasps. 

Pollard [83] developed an entire system to plan a grasp, given an initial approach 
direction and a desired set of contact faces. Her system efficiently solved the stability, 
feasibility, and reachability problems using a combination of algorithms and heuristics. 

2-4-3 Pre-Shapes and Hand Primitives 

With a pre-shape, the basic form of the hand is limited to a small number of predefined 
configurations. Each pre-shape has a small number of parameters that are used to vary 
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the configuration. For example, a curl pre-shape with one parameter could be used wrap 
the hand around an object. In essence, pre-shapes are used to limit the configuration 
space of the hand. By using pre-shapes, the grasp planning process can be simplified. 

A pre-shape provides an initial configuration of the hand that is considered likely 
to result in a good grasp. This gives a starting point for choosing the fingertip to face 
assignments. Alternatively, pre-shapes can be used to define an acquisition strategy. 
The curl pre-shape can be used as a starting hand form. A grasping strategy could 
simply reduce the curl parameter until contact with the object has been made. 

An assumption made by many grasp planners is that only fingertip contacts are 
considered. Certain types of grasps fall, at least partially, into this category. Writing 
with a pencil is an example, though even here the palm provides important support for 
the implement. Certainly, a good power grasp would have far more object-hand contacts 
than just with the fingertips. The fingertip grasp simplification is often made because 
it makes stability analysis more tractable. 

Iberall et al. [52] and Stansfield [100] used knowledge based approaches for grasp 
planning and pre-shape selection. This type of approach has the potential advantage 
of being able to incorporate task-level specifications into the knowledge base. Tomovic 
et al. [103] synthesized grasps by matching the object to a small number of geometric 
primitives, and selecting a pre-shape based on the primitive. 

2-4-4 Acquisition Behaviors 

The problem discussed until now has been one of planning and analyzing a grasp of a 
known object at a known location. An important alternative scenario is one where the 
object to be grasped is not known, or is known but not at a pre-determined position, 
or both. This can be called object acquisition, to distinguish it from planned object 
grasping. We acquire objects as part of our daily repertoire of manipulations. Reach¬ 
ing into our pockets to pull out their contents is an example of such a manipulation. 
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Chammas [18], for example, addressed this problem. He found the geometric conditions 
necessary for form closure grasps of cylindrical objects. With his analysis, capture zones 
can be computed, where in a computed region a particular strategy is guaranteed to 
grasp the object. 

Robots are needed to perform acquisition tasks. NASA is interested in a tool retrieval 
system that can acquire free floating lost tools. For another space application, NASA is 
studying reliable systems for grasping beams. In some sense, the role of a parts feeder is 
to acquire object and to determine their position and orientations. Typically, a feeder is 
hard to design, and is specific to a particular part. They are built using various tricks, 
including vibratory bowls. A more general part alignment system might consist of a 
simple hand-like gripper along with a pose determination system. Not only would this 
approach give a more flexible system, but a single pose determination algorithm could 
work for many parts. 

Unlike grasp planning, object acquisition has mostly been studied using whole hand 
grasps. Contacts between the objects and the hand are not simply limited to the finger¬ 
tips. The hand is treated as a capture device, where its pre-shape and closing strategy 
are designed to maximize the chances of capturing the object. 

With both planned grasps and acquired grasps, there is always uncertainty as to 
where the object has finally rested on the fingers. Because of uncertainty, even the best 
planned grasp will not be executed as expected. By definition, acquired grasps have no 
knowledge of the object pose. This report address the problem of recovering an object’s 
pose after it has been grasped or acquired. 

2.5 Recognition 

There are a number of recognition problems that are studied in conjunction with hands. 
Lower level recognition includes the measurement of local contact properties, such as 
curvature and texture. Higher level recognition includes object identification and pose 
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determination. For completeness, this section overviews a variety of the recognition and 
determination problems, briefly discussing the sensing strategies and the algorithms that 
have been used to solve them. 

The distinction between low level (local) recognition and high level (global) recogni¬ 
tion is not only a distinction between the types of problems studied. Rather, it contrasts 
how the sensor data itself is viewed. One approach treats tactile sensors as miniature 
vision systems that can give an image of a small object that is being manipulated. Here, 
tactile sensors are thought best for recovering local object features, such as surface cur¬ 
vature. As an alternative, tactile and kinesthetic sensors can be thought of as providing 
global information, which is useful for global recognition problems. In both cases, the 
tactile sensors extract a small piece of information at the contact point. What differs is 
how the information is used. A local strategy, such as one for curvature detection, might 
require multiple sensor readings from a small region. The finger would perform a series 
of motions to obtain this information. A global strategy would combine simultaneous 
readings from many sensors, and use the information to detect a global property, such 
as the object’s pose. 

It is important to note that while low level recognition is almost always studied as 
the problem of interpreting tactile sensor data, high level recognition can be studied 
without such data. In particular, this report explores how pose determination can be 
performed using just kinesthetic sensors, along with the geometric constraints inherent in 
the problem. Methods which fully exploit kinesthetic sensing are desirable, as they take 
full advantage of the available data. The addition of tactile sensor data to these methods 
is certainly desirable, and could be used to improve their reliability and accuracy. 

2.5.1 Low Level Recognition 

Low level recognition measures local properties at the sensor contact point. A represen¬ 
tative set of work in this area is examined here, including the measurement of curvature, 
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texture, forces, and small features. 

Fearing [36] analyzed how a tactile sensor could be used to accurately measure local 
contact curvature from a set of strain measurements. The problem of combining together 
a number of contacts to estimate curvature was studied by Brock and Chiu [15] and 
Montana [72]. Driels [29] found the orientation of a line on a flat contact array. Ellis [30] 
examined the texture classification problem. He extracted features from a tactile image, 
much in the same way that a vision system would extract features from an optical image. 
He created a feature vector which could be used for recognition. Howe and Cutkosky [51] 
designed and studied a sensor for detecting slip. Bicchi [11] studied the problem of 
force-based sensing. Russell [88] and Siegel et al. [95] developed thermal sensors that 
can measure heat conduction properties. This is useful for material identification and 
slip detection. In one of the earliest active sensing systems, Hillis [47] used his sensor to 
distinguish a set of nuts and bolts. These parts were all small compared to the size of 
his sensor. Each part was classified according to three parameters: shape, bumps, and 
stability. A sensor equipped finger actively probed a part to ascertain its parameters. 

2.5.2 High Level Recognition 

High level recognition finds global properties of an object. A representative set of work 
in this area is examined here, including finding surface maps of object, model-based pose 
determination, and the scheduling of sensor motions. 

Kinoshita [64] recognized objects using a hand covered with simple binary sensors, 
using a pattern classification scheme. Likewise, Okada and Tsuchiya [79] recognized 
object using patterns from tactile sensors and the joint angles from a hand. Allen [2] 
built a surface map of an object from multiple local measurements. Gaston and Lozano- 
Perez [38] and Grimon and Lozano-Perez [42] studied model-based pose determination. 
They assumed sensors that return contact positions and normals, and performed an effi¬ 
cient search of an object pose interpretation tree using constraint propagation methods. 
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Snyder and St. Clair [98] used a similar approach in their recognition system. Similarly, 
Schneiter [91] studied an active sensing approach, and developed a method for schedul¬ 
ing sensor motions based on the recognition scheme of Grimson and Lozano-Perez [42]. 
Using their notion of an interpretation tree, Schneiter developed a scheme for schedul¬ 
ing sensor moves to remove ambiguities in pose interpretations. Ellis [31] studied the 
problem of how a robot should proceed when acquired sensory data is insufficient to 
recognize an object and to determine its pose. Luo and Tsai [69] developed a object 
recognition system that used a tactile array and a vision system. They found features 
such as contact moments, and used a decision tree to match object feature vectors. 

Tactile recognition of objects, from a theoretical standpoint, can be though of as a 
geometric probing operation. Skiena [97] studied this problem and extended the earlier 
work by Cole and Yap [23]. Cole and Yap defined a finger probe to be the first intersection 
point p between a line / and an object P. The line specifies the path that the finger takes 
when moving toward the object. In their work, it is assumed that absolute determination 
of object P is desired. That is, given a model of object P, bounds on the number of 
probing operations necessary to determine if the unknown object is P are developed. 
They found that 2 n finger probes are necessary and sufficient to verify a convex n—gon. 
Intuitively this is true because probing each vertex and edge in the object will determine 
its position. 
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3.1 Introduction 

This chapter examines a pose determination strategy based on searching an interpreta¬ 
tion tree of potential contact assignments. An exhaustive search is avoided by exploiting 
geometric constraints from the object and the hand, and by knowledge of the grasping 
strategy. The experiments described in this chapter suggest that an object usually fits 
into a hand shape only a small number of ways. Adding additional information, such 
as joint torques, further reduces the number of potential poses, often to just one. The 
algorithm is on-line, where the computations are done after the hand has completed its 
grasping operation. By carefully pruning the pose interpretation tree, search time is kept 
small. This work also suggests that tactile sensors may not be necessary for certain pose 
determination tasks. Dextrous hands should be able to perform certain useful sensing 
tasks with just basic joint position information. 

This research assumes that the type of grasps being attempted are whole hand grasps. 
These grasps are the one that humans commonly use, where many surfaces of the fingers 
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touch the object. Even in cases of fine motion manipulation, fingertip grasps alone are 
uncommon. As an example, take writing with a pencil. While motion of the fingertips 
that are in contact with the implement is a crucial part of the manipulation, other 
surfaces of the hand provide necessary support. Most hand control research to date has 
used fingertip grasps, where only fingertip surfaces touch the grasped object. As will 
be seen, whole hand grasps have far more information content than fingertip grasps. 
They provide a large number of geometric constraints that can be exploited by object 
recognition systems. 


3.1.1 Relevance of this Problem 

Pose determination is helpful for verifying that a planned grasp has been executed cor¬ 
rectly. An exploratory robot might use small set recognition and pose determination 
to gather information about its environment. The constraint-based pose determination 
method examined in this chapter helps understand how much additional sensor data is 
really required for this type of problem. 

3.1.2 Why Use Geometric Constraints? 

The recognition method studied in this chapter relies heavily on the geometric con¬ 
straints that are obtained from models of the hand and objects. There are several 
reasons why so much emphasis is placed on these geometric constraints: 

1. They are essentially free, since the kinematics of the hand and models of the objects 
are known. 

2. They are powerful and will greatly reduce the space of possible poses (see Grim- 
son [41]). 

3. They play a complementary role to sensor data, helping to confirm possible inter¬ 
pretations and resolve ambiguities, which results in a more robust system. 
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The shape of the hand provides many global clues as to how an object might be 
oriented. Imagine grasping an object that is much larger in one dimension than in the 
other. The separation distance between the fingers and an opposing thumb will probably 
rule out certain orientations of the object. For example, if two particular finger links 
are involved with a grasp, a simple distance constraint will determine which parts of the 
object can be placed between them. 

3.1.3 Why Use Just Joint Angle and Torque Data? 

Many potential haptic data sources can be used for input to a localization algorithm, 
including joint angles, joint torques, wrist forces, tactile data and visual data. When 
deciding if a data source should be employed in a recognition algorithm, one must 
consider the cost and difficulty of obtaining the data, among other factors. Importantly, 
minimal sensing recognition approaches allow a better understanding of the problem’s 
inherent constraints. A strategy that works well with limited data will only benefit from 
additional information if it is available. 

The methods described in this chapter utilize just joint angle and joint torque data. 
It is important to explore the full power of these particular data sources since they are 
so readily available. Almost all robots provide joint angle and joint torque sensors as 
part of their normal control system. A primary goal of this chapter is to investigate how 
much haptic information content is present in this data, since it is available essentially 
for free. 


3.1.4 Chapter Overview 

The following sections show how constraints are used for object pose determination. 
Section 3.2 provides an overview of previous work. Section 3.3 outlines the assumptions 
required for the solution given. The approach is discussed in Section 3.4. Section 3.5 
presents a number of simulations of the algorithm. Section 3.6 discusses the experimen- 
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tal setup and the results. Potential extensions of the approach for problems in three 
dimensions are presented in Section 3.7. Section 3.8 presents conclusions. 

3.2 Related Work 

The recognition strategies used in this chapter are most closely based on the work of 
Gaston and Lozano-Perez [38] and Grimson and Lozano-Perez [42]. They decompose 
recognition into an efficient search of an object pose interpretation tree using constraint 
propagation methods. A set of feasible object poses are generated, and then tested for 
validity using an additional set of verification constraints. While the approach described 
in this chapter uses a similar notion of an interpretation tree, it uses a different set 
of data sources. Their method relies on knowledge of contact locations and normals. 
This work assumes a much weaker set of sensor inputs. Importantly, no explicit contact 
sensor information is used. 


3.3 Assumptions 

The following assumptions are made in this chapter: 

• The hand has been modeled. 

• The objects have been modeled using polyhedra. 

• The grasped object is assumed to be in static equilibrium. 

• Hand joint angle sensor data is available. 

3.4 Approach 

This section describes the method used for determining the pose of a grasped object. The 
algorithm uses a generate and test paradigm, where candidate poses are hypothesized 
and then tested for validity. Thus, this description is broken into two components, the 
generator and the tester. 
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Figure 3.1: Overview of the constraint-based recognition method. Pose candidates are gen¬ 
erated from the hand shape. A tester removes the candidates that are inconsistent. 

The algorithm described is for planar hands and planar, polyhedral objects. As will 
be seen, the algorithm can also be used for three-dimensional polyhedral objects that 
are resting on a table on one of their faces. A full three-dimensional extension of this 
method should also be possible, and is discussed in Section 3.7. 

For a generate and test method to be useful, the generator must come up with 
a reasonably small set of candidates for testing. The generator must perform its job 
efficiently, as there is a large space of possible solutions for it to traverse. A useful 
generator must also be complete, in that it should never prune the correct solution from 
the interpretation space (Grimson and Lozano-Perez [42]). The most important criterion 
for the tester is for it to incorporate all additional available constraints into its tests. 
It need not be particularly efficient, since it should have relatively few candidates to 
evaluate. 

A brief overview of the algorithm is presented here, giving a road map for the re¬ 
mainder of this section. The algorithm is present as two distinct modules, the generator 
and the tester: 
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1. Pose Generation: 

(a) Find consistent object vertex pairs that can be placed on finger link pairs, 
using a distance constraint. 

(b) Find consistent object vertex triplets that can be placed on finger links, using 
the vertex pairs. 

(c) Determine the orientation of the object vertex triplet triangle. It will be 
shown that each vertex triangle can have only a finite number of placements 
on the finger links. 

(d) Find the orientation of the entire object, based on the orientation of the 
vertex triplet triangle. 

2. Pose Testing: 

(a) Verify that the generated object pose and hand are free of intersections. 

(b) Verify that the generated object pose is consistent with the joint torque sensor 
data. 

Figure 3.1 diagrams the method, and shows the flow of information, from inputs to 
outputs. 

3-4-1 Constraining an Object’s Position 

The problem of geometrically fixing the position of an object in a hand can be decom¬ 
posed into the problem of assigning object vertices to finger segments. There are two 
types of assignments that must be considered (see Figure 3.2): 

1. Three object vertices can be placed on three finger segments (Figure 3.2 A). 

2. Two object vertices can be placed on one finger segment, and another vertex placed 
on a second segment (Figure 3.2 B). 
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Figure 3.2: Two types of grasping constraints. Object A is grasped with three vertices on 
three finger segments. Object B is grasped with two vertices on one finger segment and another 
vertex on a second segment. The object vertex triangle is shown for each grasp. 

In both these cases, the position of the object vertex triangle, as drawn in Figure 3.2, is 
fixed with respect to the fingers. That is, specifying either of the two classes of contacts 
fixes the position of the object. Note that this method does not just place an object that 
is triangular in shape. Rather, the triangle formed by any three of an object’s vertices 
is placed, which constrains the position of the object as a whole. This placement cannot 
be made when any two of the finger segments are parallel, as the vertex triangle is not 
fully constrained. 

The object initially has three degrees of freedom, one rotational and two translational. 
By placing an object vertex on a finger edge, one degree is constrained. Thus, assuming 
vertex contacts, at least three vertices must be placed to fully constrain the object. 
Placing an edge of the object on an edge of the finger results in two degrees of constraint. 
Thus, placing one edge and one vertex, or two edges, fully constrains the object’s position 
to a discrete set, unless the case is degenerate (e.g. parallel edges). 

Note that the constraints discussed are different from grasping constraints. That 
is, just because an object’s position is specified by its finger contacts, it may or may 
not be stably grasped. To determine if a grasp is possible, one must consider other 
factors, including the type of contact. In the case of a point contact, one must consider 
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if it is with or without friction. For the purposes of this recognition strategy, these 
considerations are ignored. Instead, the geometric constraints necessary to specify an 
object’s location are considered, not the contact conditions necessary for the grasp to 
be stable. 


3-4-2 Pose Generation 

The pose generation process is based on an interpretation tree, as developed by Gaston 
and Lozano-Perez [38]. The interpretation tree represents all possible assignments of 
object vertices to finger edges. Nodes in the tree represent fingers segments and links 
represent object vertices. Thus, a node and a link together represent an object vertex 
to finger segment assignment. The depth of the tree is equal to the number of object 
vertices. Each node has one child for each finger link segment, plus a special no-contact 
node, as used by Grimson and Lozano-Perez [42]. A fully expanded interpretation tree 
represents all possible assignments of object vertices to finger edges, without regard to 
the feasibility of the assignments. Only certain branches of the tree correspond to feasible 
assignments. The trick to using an interpretation tree efficiently is to generate only its 
feasible nodes. The gist of this method is how to selectively perform this expansion. 

An object grasped by a hand, and the corresponding interpretation tree, is shown 
in Figure 3.3. The hand has three links, and the object has three vertices. Only the 
feasible portions of the tree are diagrammed, where feasible means assignments that 
satisfy a particular set of constraints. Branches representing placements that are not 
feasible have been omitted from the drawing. Later in this section, the methods used 
for determining the branches that should be pruned are discussed. For now, just assume 
that such pruning methods exists. 

The actual contact assignment between the fingers and the object, as diagrammed in 
Figure 3.3, corresponds to the left most branch in the interpretation tree. This path is 
highlighted. Reading from the root to the final leaf, the contacts are V\ —► F\, V 2 —* 
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Figure 3.3: A pose interpretation tree for a grasped object. The object is composed of three 
vertices. The hand has three finger segments (two fingers and a palm). A branch from a parent 
node (object vertex) labeled with a finger indicates a placement of that vertex on that finger. 
This tree only contains assignments of one vertex to each finger segment. 
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and V 3 —► F 2 . Other branches in tree represent other feasible contact assignments. For 
example, the second to left most branch represents the contact assignment V\ —► F 2 , 
and V 2 —► F 3 . In this case, V 3 is not in contact with any of the finger links. 

In general, a connection from a link to a node indicates an assignment of an object 
vertex (the link) to a finger segment (the node). A connection from a node to a no¬ 
contact node indicates that the object vertex is not in contact with that finger segment. 
A path from the root of a tree to any intermediate non-terminal node represents a partial 
assignment of an object’s vertices to finger segments. A path from the root of a tree 
to a leaf indicates a full assignment of all object vertices to finger segments. Thus, any 
node in the tree represents a potential contact assignment. The path from the root to 
that node specifies the particular assignment. 

Any node that has at least three vertex-finger assignments constrains the position of 
the object, in a geometrical sense. The role of the pose generator is to efficiently build 
this interpretation tree and find such candidates, and to pass them to the verifier. The 
next sections discuss the constraints that are used to prune inconsistent paths from this 
tree. 

Finding Consistent Vertex Pairs 

The first of the two tree pruning constraints used is called the vertex pair constraint. 
A pair of object vertices can be placed on a pair of finger edges when there is at least 
one place where the length of a line drawn between each finger is equal to the distance 
between the vertices. Figure 3.4 shows where such a line can be drawn between two 
particular edges. All vertex-finger assignment pairs in a path in an interpretation tree 
must satisfy this criterion. If an assignment fails this test, the path in the tree is pruned. 
An efficient way to compute the ranges of position on each finger where such an fixed- 
length line can be drawn is now presented. 

Let ri be the ray of finger edge F\ and r 2 by the ray of finger edge F 2 , as shown in 
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Figure 3.4: Possible placements of a vertex pair on finger edges. This diagram shows the 
family of positions where a pair of object vertices (shown as a fixed length line) can be positioned 
between two finger edges. 



Figure 3.5: Distance constraint coordinate system. A complex coordinate scheme, shown 
here, is used to compute the distance constraint. 
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Figure 3.5. The range of values of rj such that a line of fixed length D can be drawn 
from r\ to r 2 is desired. To begin, note the equations for the endpoints of the edges and 
the fixed length line: 


Po = r 0 e t0 ° 

Pi = P 0 + r x e x ' 
Pi = Pi + De' 6 
P, = r,e is >. 


(3.1) 

(3.2) 

(3.3) 

(3.4) 


P 2 can also be written as 


P, = Pi + rie-’P 


(3.5) 


By substitution using the above equations, r 2 can be obtained in terms of rq 


r 2 = r 0 e <(flo-tfa) + r ie i{ei ~ e2) + De i(e ~ 62) - r 3 e i{H ~ e2) . 


(3.6) 


Expanding, and setting the complex part of the solution to 0 gives 

0 = r 0 sin (9 0 - 9 2 ) + r i sin (#i — 0 2 ) + D sin (0 - 9 2 ) - r 3 sin (9 3 - 9 2 ). (3.7) 


Solving for 9 , 


„ . /r 3 sin(0 3 - 9 2 ) - r x sin (0 X - 9 2 ) - r o sin(0 o - ^)\ . a 

9 = sin I-—-I + ^2 

and noting that sin -1 x is defined only for — 1 < x < 1 we obtain: 

_ r 3 sin ( 9 3 - 9 2 ) - r x sin (fl x - 9 2 ) - r 0 sin ( 9 0 - 9 2 ) < 

- D 

Thus we conclude that r 1 can take on values in the range: 

D-K ^ -D-K 

> r\ > — 


where, 


sin (0 X — 9 2 ) sin (0 X — 9 2 ) 

I< = r 0 sin (9 0 - 0 2 ) - r 3 sin (9 3 - 9 2 ). 


(3.8) 


(3.9) 


(3.10) 


(3.11) 
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region of overlap 



Figure 3.6: Consistent vertex pairs. Three object vertices placed on three finger edge segments 
are shown. Vertex Vj from both pairs shares a common edge, and overlap. 

Equation 3.10 gives the range of positions along finger edge F\ where a line of length 
D can be drawn to finger edge F 2 . For convenience, this equation can be represented by 
a function R: 

R(Fi,F 2 ,D) — F[, (3.12) 

where F[ is the portion of F\ where the line of length D can be placed. If ||.E 1 , || = 0, 
then the placement is not possible. Thus, when given a pair of vertex-finger assignments, 
Equation 3.12 provides a fast check for distance consistency. From the viewpoint of the 
interpretation tree, this test validates the consistency of a link. 

Finding Consistent Vertex Triplets 

Two pairs of vertex-finger assignments are considered consistent if one of the vertex- 
finger assignments from each pair is the same, and if they overlap in placement on the 
common finger. This can occur between two adjacent links in a path in an interpretation 
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tree. The diagram of a hand and potential object vertex placements shown in Figure 3.6 
helps explain this. 

Two vertex pairs are shown, {Vj, V 2 } and {Vi, V3}, where V\ is on finger F\, V 2 is on 
finger F 5 and V 3 is on finger F 4 . The lines E i2 and F 13 are the equal length distances 
between each pair of object vertices, where E\ 2 = || Vj V 2 II and F 13 = || Vi V 3 1|. In this 
case, the vertex triplet condition is met because for both pairs, vertex V\ is on finger iq, 
and there is an overlap of the placement range of the shared vertex, V\. 

More formally, the overlap range can be computed using the vertex-pair constraint, as 
defined in Equation 3.12. First, the placement range for each pair of edges is computed: 


Ka 

= R(Fi,F 5 , E 42 ) 

(3.13) 

F[ b 

= R(Fi,F 4 ,Ei 3 ), 

(3.14) 


where F[ a and F[ b are the subranges of F\ where each placement can be made. The 
overlap range can be found be computing their intersection, 

F[ = F;„ n F[ t , (3.15) 

where F[ is the desired part of edge F\ where the common placement can be made. 

Finally, the subranges of F 5 and F 4 that are compatible with this placement are 
easily computed using function R: 

F' = R(F{,F s ,E 12 ) (3.16) 

E 4 = R(F{,F 4 ,E l3 ). (3.17) 

Thus, this test, when given two vertex-finger pairs that share a common placement, 
provides a fast consistency check. While the previous constraint tested the consistency 
of a single link in an interpretation tree, this test is used to verify the consistency of 
a pair of links. The next section will describe in more detail how the vertex pair and 
vertex triplet constraints can be used to efficiently build an interpretation tree. 
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Efficiently Building the Interpretation Tree 

At this point we describe how to efficiently build an interpretation tree using the vertex 
pair and vertex triplet constraints. To build the tree, each node is expanded, starting 
from the root, in a breadth first search. A node will have one child for each of the finger 
segments in the hand, along with an additional child for the no-contact case. Each 
child will be tested for consistency using the constraints developed in the previous two 
sections, and is pruned if it does not pass the tests. 

The vertex pair represented by the link from the new node to its parent is considered. 
Pair elements on different finger segments are tested using the vertex pair constraint. If 
the pair elements are on the same segment, the distance between the vertices must simply 
be less than or equal to the length of the segment. If the pair elements are consistent, 
the placement ranges are noted, and the link is added to the tree. If the new node is 
inconsistent, the tree is pruned at that node. The vertex triplet test is then applied to 
any groups of three or more vertex-finger assignments in the path. This process finds 
the new subranges of each finger segment that are consistent with the placements on 
the path. If any of the subranges are null, the new node is pruned. If the tests are all 
successful, any three vertex-finger assignments on the path can be used to compute an 
object pose. 

To better understand how this process works, consider a path from an interpretation 
tree that contains three assignments: Fi — ► Vj, F 2 —> V 2 , and F 3 — > V 3 . The next few 
paragraphs will detail how these assignments are added to the tree. 

The first assignment of F\ —► is arbitrarily made. Assume that a new leaf is 
being evaluated that assigns F 2 —► V 2 . Thus, the path from that leaf to the root assigns 
F\ —> Vj and F 2 —* V 2 . To test the new assignment, the placement subrange of F 2 is 
computed: 

F 2 = R(F u F 2 , IIVVHII). (3.18) 

If ||f;|| > 0, the node is added to the tree. The subrange F 2 is stored at the node for 
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future use. 

Assuming that the assignment of F 2 —► V 2 was validated, let us evaluate a new 
assignment of F3 — > V3. First, a vertex pair test between the new leaf and its parent is 
performed, using the parent’s current subrange: 

F' = f?(F',F 3 ,||T^||). (3.19) 

If this test passes, the vertex triple constraint can be applied between the three edges in 
the path, where the assignment of F 2 —> V 2 is common between them. Essentially, the 
range intersection on F 2 between the assignment of F\ —► Vi and F3 —> V3 is computed. 
If the length of this intersection is greater than zero, the triple is accepted. This process 
continues until all nodes have been expanded or pruned. 

By using these pruning tests, only a small portion of the full interpretation tree is 
usually generated. For objects that lack symmetry most vertex-finger assignments are 
not consistent with the tests. It is important to note that this pruning operation will 
never remove a valid solution that has the types of contacts that are being considered. 
An actual object placement must be consistent with the tests, and will be found in this 
generation stage. 

Computing the Orientation of the Object 

The tree generation from the previous section found two types of vertex-segment assign¬ 
ments. In the first case, three vertices were placed on three different edges. In the second 
case, two vertices were placed on one edge, and the third was placed on a different edge. 
This section explains how the pose of the object is recovered from these assignments. 

First, consider the case of triplets of vertex to finger segment assignments. The 
triangle formed by each vertex triplet is fully constrained by the finger edge segments. 
Thus, the position and orientation of the triangle, and hence the object, can be directly 
computed. As will be shown, there are potentially four solution classes, and each can 
have two solutions. 
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Figure 3.7: Solution classes for vertex triplets placed on finger segments. The finger seg¬ 
ments are extended into the dotted lines. Each of the labeled classes can potentially contain a 
placement for the object vertex triangle. 

Figure 3.7 diagrams the solution classes. The three finger edge segments being con¬ 
sidered are extended to form the dotted lines. The four classes that are created by the 
intersection of the lines are labeled. The goal of this section is to determine the orien¬ 
tation of a triangle where each of its vertices are constrained to be on one of the three 
lines. Different solutions for the triangle orientation can be found for each of the labeled 
classes. Since the triangle is actually being placed on the finger edge segments, and not 
on an infinite line, solutions need be computed only for the classes where the actual 
finger segments are present on all three of its boundaries. In addition, after a potential 
solution has been found, the triangle vertices must be tested to insure that they fall on 
portions of the finger edges. 

An example of a grasped objects and the corresponding solution classes for three 
different sets of triplets of links is shown in Figure 3.8. The object is shown at the top 
of the figure. Three possible sets of constraint edges and the corresponding solution 
classes are shown below the object. In the first set, the three finger segments that 
are extended with dotted lines are being considered. The extended lines form the four 
solution classes. Class one, three, and four do not have any portion of the actual finger 
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Figure 3.8: An object vertex triangle and its potential solution classes. The upper figure 
shows a sample object gripped by a two fingered hand. The lower figures shows the triangle 
formed by three of the object vertices and the classes formed by the finger edges. 
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Figure 3.9: The notation used for computing the orientation of a three-edged triangle. The 
inner triangle is formed between the three object vertices. The outer lines are formed by the 
finger edges. The highlighted portion of the lines indicate the location of the finger segments. 

segments extending into them, so the object cannot be positioned in them. Class two 
has portions of the finger segments extending onto all three of the lines, and thus can 
potentially contain the object (and in fact, it does). 

For clarity, a brief recap may be helpful. At this point a potential assignment of 
object vertices to finger edge segments has been hypothesized. The pose of the object 
that would result from this assignment is what is being computed. There are two 
possible object poses for each of the four potential solution classes. A solution class 
must be considered if part of each finger segment extends into the edges that bound the 
class. What remains to be shown is the actual computation necessary to recover the 
object’s pose in a particular solution class. 
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First, consider the case where three object vertices are placed on three finger edges. 
Assume that the three finger edges are joined as shown in Figure 3.9. The orientation of 
the object triangle enclosed by the edges must be found. Note that while the notation 
used is defined for solution class one (see Figure 3.7), the analysis is the same for the 
other three solution classes. As shown in Figure 3.9, 0j, # 3 , and V 1 V 3 are givens. A 
solution for ai and x\ is desired. By inspection we obtain 



. „ h sin c*i 

tan = --- 

X\ — COS C*i 

(3.20) 


l 3 sin a 3 

tan 0 3 = - - -, 

x 3 - l 3 cos a 3 

(3.21) 

where xi, x 3 , oci , and 

a 3 are unknown. In addition, we note that 



«i + £*3 + 7i = ^ 

(3.22) 


^1 + ^3 = KHi 

(3.23) 

where 71 , and Vi V 3 are known. Solutions for aj and x x can now be found. 

Equations 3.20 and 3.21, we obtain 

Solving 


A cos ai + B sin a\ = C 

(3.24) 

where, 



A = 

li tan 0 3 + l 3 cos (x — 71 ) tan 0 3 + l 3 sin (w — 71 ) 

(3.25) 

B = 

h tan 03 . . . . . . . . 

—--- 1 -1 3 tan 0 3 sin (tt - 71 ) - l 3 cos ( 7 r - ji) 

tan u\ 

(3.26) 

C = 

(x! + x 3 ) tan 0 3 . 

(3.27) 

Let 

<j> = arctan ( B , A) 

(3.28) 


r = \/A 2 + B 2 . 

(3.29) 



(3.30) 
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Figure 3.10: The notion used for computing the orientation of a two-edged triangle. The 
two finger edges are joined at vertex O. The object triangle is placed between them as shown. 

we note that Equation 3.24 can be written as 

r cos (o^ — <j>) = C, (3.31) 


which gives 



(3.32) 


Finally, x\ can be obtained from Equations 3.20 and 3.32: 

x i = h + cos 7 i) . (3.33) 

Next, consider the case of two vertices assigned to one edge, and the third assigned to 
a different edge. Again, the position of the triangle formed by the three object vertices 
is fully constrained by the two finger edges. Finding the object position is much simpler 
in this case because the line formed by connecting two of the vertices is known to fall 
on one of the finger segments. 

If one object vertex is assigned to one finger edge, and an object edge to another 
finger edge, the orientation of the resulting triangle is easily obtained. Assume that the 
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two finger edges are extended to lines that intersect at point O, as shown in Figure 3.10. 
The position of the triangle can be computed from the following equation: 

d = x 2 -x i = / 3 - cos 7i) • (3-34) 

3-4-3 Pose Testing 

The previous section described how to generate possible pose candidates. This sec¬ 
tion describes how to verify that the postulated candidates are reasonable. Verification 
is necessary because the constraint-based procedure for generating poses does not uti¬ 
lize all information available. Rather, the generator uses the constraints that are both 
computationally inexpensive and that have adequate pruning power. The verifier uses 
the additional information available from geometric and from grasp acquisition strategy 
constraints for further pruning of the candidates. 

Any candidate poses that pass the verification tests are accepted. All others are 
rejected. Hopefully, only the candidate that corresponds to the object’s true position 
will remain. Experiments conducted in subsequent sections of this chapter will show 
how well this method performs. The next two sections describe the verification tests in 
more detail. 


Geometric Intersection Test 

The object candidates can potentially intersect the hand, as shown in Figure 3.11. The 
generator simply places triangles on finger segments. From the triangle, the placement 
of the entire object is computed. Nothing prevents parts of the object from intersecting 
the hand. The pose tester computes the intersection of the object and the hand, and 
will reject the candidate if the space is not null. From an implementation standpoint, a 
threshold is used to select the intersection tolerance that is acceptable. 
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Figure 3.11: Geometric verification constraint. The object triangle was placed by the algo¬ 
rithm, resulting in the object pose shown. Since the object intersects the hand, it is rejected. 



Figure 3.12: Joint torque verification constraint. Grasp A cannot occur because all joints 
were programmed to move to a torque limit. Joint 4 would not be at a limit for this grasp. 
Rather, the hand shape that would result is shown in grasp B. 


Grasp Acquisition Strategy Test 

A grasp acquisition strategy can be selected that provides additional information for pose 
determination. For example, the move-until-contact strategy curls each joint forward, 
until motion of the joint is no longer possible. A joint torque sensor on the robot may 
be required for implementing such a strategy. As an alternative, the strategy can be 
designed into the mechanics of the robot, as Greiner [40] has done with her device. 

The diagram shown in Figure 3.12 is used to help understand how this strategy 
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can verify pose candidates. The move-until-contact strategy provides a guarantee that 
all joints are constrained from further motion. In this figure, grasp A violates this 
constraint, as joint 4 is capable of motion. The distal link is free to move until it has 
collided with the object as shown in grasp B. Thus, if the pose shown in grasp A was 
generated, and if a move-until-contact grasping strategy was used, the pose would be 
rejected. 

In general, the grasp acquisition strategy test is implemented by simulating the 
actual grasping strategy. For each candidate pose generated, the grasping strategy is 
simulated. If the resulting hand shape matches the actual hand shape, the pose is 
accepted. Otherwise, it is rejected. It is important to note that to use this constraint 
successfully, a good grasp simulator must exist. If the simulator produces erroneous 
results, the test could reject the correct pose, or accept incorrect poses. 

In general it is hard to create a good grasp simulator. For certain strategies, including 
move-until-contact, the simulation process is easier. To further reduce the chances of 
incorrect simulations, the following procedure was used. Rather than to perform a 
full simulation of a move-until-contact grasp, the resulting grasp and pose were simply 
analyzed. Each joint, from distal to proximal, was moved forward a small amount. The 
joint’s link was then tested for collision with the object. If a collision did not occur, the 
pose was considered to be in invalid. The process was repeated for all joints. 

3.5 Simulations 

This section examines the performance of the pose determination algorithm on simulated 
grasps. The grasp simulator uses a move until contact grasping strategy. The joints on 
a finger, from proximal to distal, are moved forward, until the joint’s link makes contact 
with the object, or until it reaches a limit. The number of joints, and their length, can be 
varied. The simulator supports single or double jointed fingers. For these runs, double 
jointed fingers were used. Actual robotic hands come in both flavors. The Salisbury 
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hand, for example, is double jointed. The Utah-MIT hand is not. 

By using simulations, a large number of trials can be performed in a systematic 
manner. The problems caused by poor kinematic models of the robot are also avoided, 
which is helpful for initial testing. Though the simulations are useful, they cannot 
substitute for experiments using actual hardware. Such experiments are described in 
the next chapter. 

For each of the simulations, a table summarizing the computations performed is 
presented. The terms used in the table are described here: 

• Vertices are the number of vertices in the object. 

• Finger segments are the number of finger segments (links), including the palm. 

• Expanded nodes are the number of nodes in the interpretation tree that were ex¬ 
amined. 

• Expanded paths are the number of paths in the interpretation tree that were ex¬ 
amined. 

• Placement paths are the number of paths in the tree that were generated. 

• Full tree paths are the number of paths in a fully expanded interpretation tree. 

• Generated poses are the number of poses generated, both those verified and those 
rejected. 

• Verified poses are the number of poses that passed both verification tests. 

Simulation 1: For this simulation, a two-jointed hand was used. This is the minimal 
case, where only two numbers are being provided to the pose generation module. The 
object has five vertices, and is small enough to be enclosed by the hand. Table 3.1 
summarizes the simulation results. Due to the rather small size of the problem, a sig¬ 
nificant percentage of the interpretation tree’s paths were expanded. Only one of the 
generated poses was verified, as shown in Figure 3.13. This pose corresponds to the 
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Table 3.1: Simulation 1: Summary of the interpretation tree. 



Figure 3.13: Simulation 1: generated and verified pose. This pose corresponds to the correct 
grasp of the object. 

object’s grasped position. The entire set of generated and rejected poses are shown in 
Figure 3.14. All these poses were rejected because they are have geometrical inconsis¬ 
tencies. Note that the middle pose on the second row is a fairly good fit, and with a 
larger “slop” factor, would have been accepted. 


Simulation 2: As with the previous simulation, a two-jointed hand was used for this 
trial. In this case, a more complex object, with eight vertices, was grasped. The object 
is larger in size than the object from previous trial. Table 3.2 summarizes the simulation 
results. As the table indicates, the larger number of vertices results in an increase in 
the potential vertex placements, though the consistent placements remain rather small. 
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Figure 3.14: Simulation 1: generated and rejected poses. The entire set of generated and 
rejected poses are shown here. 

Three of the generated poses were verified, two of which are shown in Figure 3.15. The 
first pose corresponds to the object’s actual position. A small selection from the 105 
generated and rejected poses are shown in Figure 3.16. The first pose, in the upper-left 
corner, was accepted by the geometric consistency check, but rejected by the joint torque 
constraint check. 



60 


Chapter 3 Constraint-Based Pose Determination 


vertices 

8 

finger segments 

3 

expanded nodes 

17,643 

expanded paths 

11,776 

placement paths 

1,664 

full tree paths 

87,381 

generated poses 

108 

verified poses 

3 


Table 3.2: Simulation 2: Summary of the interpretation tree. 




Figure 3.15: Simulation 2: generated and verified poses. The first pose corresponds to the 
object’s position. One other consistent pose was also found. 

Simulation 3: For this simulation, a two-jointed hand was used again. This object had 
four vertices. Table 3.3 summarizes the simulation results. As in a previous example, 
because the object is rather small, a significant percentage of the tree paths were exam¬ 
ined. Later examples will better show the power of the pruning operations. Figure 3.17 
shows the two poses that were generated and accepted. Figure 3.18 shows the poses that 
were generated and rejected. All rejected poses relied on the joint torque test, as their 
geometry is consistent with the hand shape. 
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Figure 3.16: Simulation 2: generated and rejected poses. A selection of poses from the 105 
rejected ones are shown here. Note that the upper-left pose was rejected by application of the 
joint torque constraint. 
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vertices 

4 

finger segments 

3 

expanded nodes 

211 

expanded paths 

146 

placement paths 

46 

full tree paths 

341 

generated poses 

4 

verified poses 

2 


Table 3.3: Simulation 3: Summary of the interpretation tree. 



Figure 3.17: Simulation 3: generated and verified poses. The left pose corresponds to the 
object’s position. The right pose is also totally consistent with the data. 




Figure 3.18: Simulation 3: generated and rejected poses. The entire set of generated and 
rejected poses are shown. All these poses were rejected by applying the joint torque constraint. 
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vertices 

8 

finger segments 

3 

expanded nodes 

9,121 

expanded paths 

5,776 

placement paths 

1,066 

full tree paths 

87,381 

generated poses 

58 

verified poses 

2 


Table 3.4: Simulation 4 • Summary of the interpretation tree. 



Figure 3.19: Simulation 4 : generated and verified poses. The right pose corresponds to the 
object’s position. The left pose is also totally consistent with the data. 

Simulation 4 : For this simulation, a two-jointed hand was again used. This object 
had eight vertices. Table 3.4 summarizes the simulation results. Two object poses were 
consistent with the data, and were accepted by both verification tests, as shown in 
Figure 3.19. Figure 3.20 shows the two poses that were generated but rejected. 

Simulation 5: For this simulation, a four-jointed hand was used. The object had five 
vertices. Table 3.5 summarizes the experiment’s results. The object’s pose was correctly 
recovered by the algorithm, as shown in Figure 3.21. Some of the verified but rejected 
poses are shown in Figure 3.22. 




















Figure 3.20: Simulation 4'- generated and rejected poses. A sample of the 56 rejected poses 
are shown. 
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vertices 

5 

finger segments 

5 

expanded nodes 

1,381 

expanded paths 

972 

placement paths 

360 

full tree paths 

9,331 

generated poses 

5 

verified poses 

1 


Table 3.5: Simulation 5: Summary of the interpretation tree. 



Figure 3.21: Simulation 5: generated and verified pose. The correct object pose was recovered 
by the algorithm. 


vertices 

6 

finger segments 

7 

expanded nodes 

7,791 

expanded paths 

5,432 

placement paths 

1,858 

full tree paths 

299,593 

generated poses 

32 

verified poses 

1 


Table 3.6: Simulation 6: Summary of the interpretation tree. 
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Figure 3.22: Simulation 5: generated and rejected poses. Some of the rejected poses are 
shown here. 



Figure 3.23: Simulation 6: generated and verified pose. 

Simulation 6: For this simulation, a six-jointed hand was used. The object had six 
vertices. Table 3.6 summarizes the results. In this case, the larger number of joints and 
vertices resulted in a potentially large tree size. Nonetheless, only a small number of 
the full tree’s paths were examined. In addition, the number of generated poses was 
small. This trial provides a good example of how effective this method is for reducing 
the expensive computations required for finding potential pose candidates. Just one 



§ 5.5 Simulations 



Figure 3.24: Simulation 6: generated and 
poses are shown. 
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poses. A sample of some of the 31 rejected 
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vertices 

9 

finger segments 

9 

expanded nodes 

84,844 

expanded paths 

48,406 

placement paths 

8,746 

full tree paths 

1,111,111,111 

generated poses 

158 

verified poses 

2 


Table 3.7: Simulation 7: Summary of the interpretation tree. 



Figure 3.25: Simulation 7: generated and verified poses. The pose on the left corresponds 
to the object’s actual position. 

pose was generated and verified, as shown in Figure 3.23. Figure 3.24 shows a few of 
the rejected poses. 

Simulation 7: For this simulation, an eight-jointed hand was used. The object had 
nine vertices. Notice, from Table 3.7, that while the full interpretation tree is huge, 
the number of consistent paths that were found remained small. Two of the generated 
poses were verified, as shown in Figure 3.25. The first of those poses corresponds to 
the object’s grasped position. A sample from the 156 generated and rejected poses are 
shown in Figure 3.26. 
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Figure 3.26: Simulation 7: generated and rejected poses. A sample of some of the rejected 
candidates. 
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vertices 

9 

finger segments 

9 

expanded nodes 

137,871 

expanded paths 

84,717 

placement paths 

10,485 

full tree paths 

1,111,111,111 

generated poses 

250 

verified poses 

5 


Table 3.8: Simulation 8: Summary of the interpretation tree. 






Figure 3.27: Simulation 8: generated and verified poses. The pose in the upper left corre¬ 
sponds to the object’s actual position. The other poses are close verification matches. 
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Figure 3.28: Simulation 8: generated and rejected poses. A sample of some of the 2j5 
rejected poses are shown. 
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Figure 3.29: Photograph of the Utah-MIT hand. 


Simulation 8: For this simulation, an eight-jointed hand was used. The object had 
nine vertices. Notice, from Table 3.8, that the number of consistent paths that were 
found remains small. In this run, more tolerance was allowed in the verification stage, 
resulting in a few additional matches. The upper-left pose corresponds to the object’s 
position when it was grasped. Figure 3.28 shows some of the rejected poses. A total of 
250 poses candidates were generated in this run. 
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3.6 Experiments 

The pose determination method described in this chapter was tested on the Utah-MIT 
hand (Jacobsen et al. [57]), using polyhedral objects. A photograph of the hand is 
shown in Figure 3.29. In general, the experiments confirmed the results obtained from 
the simulations. The constraint-based pose determination algorithm usually found the 
object’s pose, or at worst found a small set of consistent poses which included the correct 
pose. This chapter describes the experimental procedures used, and presents a few of 
the trials. 

Since the two dimensional recognition algorithm was used, the test objects were sym¬ 
metric along the grasping axis. The illustrations that follow are cross sections through 
this axis. For simplicity, the objects were oriented in a manner to facilitate easy grasp¬ 
ing. The three dimension case, using the added constraint that the object were resting 
on a table-top, was not performed due to limitations in the experimental hardware. 

To facilitate the joint torque constraint, the hand was programmed to close on the 
objects by applying a fixed torque to all its joints. This commanded the hand to wrap 
around the object with all joints moving forward until contacts were made, or until a joint 
limit was reached. When the grasp completed, the joint angles were obtained and passed 
to the recognition module. See Appendix A for a detailed description of the Utah-MIT 
hand setup, its computational architecture, and its interface to the recognition system. 

Trial 1: An object grasped by the Utah-MIT hand is shown in Figure 3.30, along with 
the position of the hand’s fingers that resulted from a grasp of that object. The view 
of the hand is a cross-section of its xy plane, where the thumb is to the left and the 
middle finger is on the right (see Narasimhan [74]). Note that joint one of the thumb, 
the proximal joint, is elevated above the palm plane. Both the thumb and fingers have 
three joints that move in the xy plane. 

For this grasp, a total of 17 possible placements of the object were found prior to the 
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Figure 3.30: Trial 1: grasped object. The object and the hand shape. 



Figure 3.31: Trial 1: generated and verified pose. Only the actual grasp of the object was 
generated and verified by the algorithm. 

verification stage. When the verifier applied the intersection and torque constraints, all 
of the incorrect poses were eliminated, leaving just the correct pose. Figure 3.31 shows 
the correct pose, as found by the algorithm, while Figure 3.32 shows the object poses 
that were eliminated by the verifier. 

Trial 2: A second object grasped by the Utah-MIT hand is shown in Figure 3.33. Again, 
the system obtained the correct pose of the object, as shown in Figure 3.34. In this case, 
however, the verifier accepted two poses that were not correct, as shown in Figure 3.35. 
As can be seen in the figures, the verified but incorrect poses have symmetries that 
allowed the object to fit into the hand and still meet all required constraints. Six 
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Figure 3.32: Trial 1: generated and rejected poses. These poses were found by the generator, 
and rejected by the verifier using either the joint torque of intersection constraints. 




Figure 3.33: Trial 2: grasped object. The object and the hand shape. 
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Figure 3.34: Trial 2: generated and verified pose. The actual pose of the object was found 
by the algorithm. Note that two other poses that were entirely consistent with the constraints 
were also found. 




Figure 3.35: Trial 2: other generated and verified poses. These poses were also found by 
the algorithm. Though they are consistent with all constraints, they do not correspond to the 
actual pose of the object. 

additional poses were also found, shown in Figure 3.36. These, however, were eliminated 
by the verifier. 


3.7 Simulations in Three Dimensions 

While the method presented in this chapter was described and implemented in two 
dimensions, extensions to three dimensions are possible. The basic approach would 
remain the same: assign object features to finger edges, using an interpretation tree 
to guide the search. For two dimensions, edge-edge and edge-vertex placements are 
considered when building the tree. When a set of assignments provide three independent 
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Figure 3.36: Trial 2: generated and rejected poses. These poses were found by the generator, 
and rejected by the verifier using either the joint torque of intersection constraints. 

constraints, it is possible to solve for the object’s position. For three dimensions, face- 
face, face-edge, face-vertex, and edge-edge placements can be considered. When a set 
of assignments provides six independent constraints, the position of the object can be 
found. 

An important question to ask is if the constraints that are used, the hand shape and 
grasp acquisition strategy, are enough for problems in three dimensions. To gain insight 
on this, a set of simulations were performed. The hand’s fingers are assumed to be 
modeled as single edge segments. Objects were modeled as a set of edge segments. With 
this model, edge-edge contacts between the hand and an object are the most common 
to occur. Grasps of objects were simulated, giving a set of finger edges, some of which 
are in contact with the object. 

A three-dimensional version of the distance constraint was used to expand an inter¬ 
pretation tree. The tree is organized around hypothesized assignments of object edges 
to finger edges. A new assignment is tested to insure that its parent finger-object edge 
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assignment is compatible with the new finger-object edge assignment. If it is not, the 
assignment is pruned. Note that only the most basic distance compatibility test was 
performed. Range propagation between assignments in the path was not implemented. 
This makes the constraint significantly less powerful than it would otherwise be in a 
complete implementation. 

Initial results from the simulations indicated that using the basic distance constraint 
alone was inadequate. The portion of the interpretation tree that was expanded was 
too large, and the number of consistent candidates that were generated was prohibitive. 
Perhaps this is not surprising, as the combinatorics for three dimensions is significantly 
larger than for two dimensions. While an implementation of range propagation tech¬ 
niques was not performed, it is thought that it would help significantly. Nonetheless, 
additional pruning of the tree is clearly necessary. 

In previous sections, the grasp acquisition strategy was used simply as a verification 
test. Instead, it is possible to use the grasping strategy in the pose generation stage. 
For example, by using a move-until-contact strategy one can determine the links of the 
hand that are in contact with the grasped object. Here is how such a procedure would 
work. Each finger is rolled forward, from distal to proximal, one joint at a time. When 
contact is detected by monitoring the joint torque sensors, the next joint in the chain 
is rolled forward. If that joint can move, the contact was made on the previous link. If 
the joint cannot move, the contact is somewhere up the chain, and will be found latter. 
This process is repeated for all joints in the finger. 

With knowledge of the finger segments that are in contact with the grasped object, a 
substantial search reduction is possible. In particular, there are two benefits. First, the 
number of finger segments that are considered in the tree is greatly reduced. Second, 
the no-contact link from the tree is eliminated. The simulations described below use 
this additional source of information, and as will be seen, the results are promising. 

For each of the simulations, a table summarizing the computations performed is 
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Figure 3.37: Three dimensions trial 1: object and grasp. 

presented. The terms used in these tables are slightly different from those used in the 
previous simulations section, and are described here: 

• Object edges are the number of edges in the object. 

• Finger edges are the number of finger edges (links), including the palm. 

• Expanded nodes are the number of nodes in the interpretation tree that were ex¬ 
amined and that had at least one expanded child. 

• Pruned nodes are the number of nodes in the interpretation tree that were exam¬ 
ined and that had no children expanded. 

• Placement paths are the number of paths in the tree that were generated. 
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Trial 1: The grasped object is shown in Figure 3.37. Note that the highlighted finger 
edges are the ones that are in contact with the object. The results for this simulation 
are summarized in Table 3.9. Here, the columns indicate the percent of the full length 
of each finger segment that was used when generating the tree. The column labeled 
100% gives the results for the full finger segments, as shown in Figure 3.37. For lower 
percentages, the finger segments are reduced in length by the given amount. Essentially, 
the endpoints of the segment are adjusted along the line to reduce the segment length, 
while preserving the contact point between the finger and object edge. By reducing the 
finger segment length, the distance constraint becomes more powerful. This helps, for 
example, investigate how much additional recognition power would be present in a hand 
of similar total finger length, but with more finger links. 

For this trial, the simulation results are quite promising. The portion of the tree 
expanded is manageable, and the total number of poses that need to be generated is 
small enough to be feasible. Note that by reducing the finger segment length, a significant 
reduction in tree size was achieved. The results shown here are among the best that 
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Figure 3.38: Three dimensions trial 2: object and grasp. 
were achieved during a number of simulations. 

Trial 2: Not all simulations produced results as good as those shown in the previous 
example. An object similar to the one used in that example was grasped, as shown 
in Figure 3.38. The object had slightly different dimensions and a different orientation 
from the previous object. The simulation results are summarized in Table 3.10. 

For this trial, a rather large number of nodes in the tree were explored for the case of 
100% edge length. In addition, a large number of consistent paths of length 6 were found. 
As the edge length was reduced in size, as previously described, the results became more 
promising. For a 60% reduction, a manageable 6,740 candidates were generated. 
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object edges 

18 

finger edges 

6 

full tree paths 

34,012,224 



finger length 

100% 

80% 

60% 

expanded nodes 

196,960 

133,103 

24,536 

pruned nodes 

1,014,083 

733,765 

212,130 

placement paths 

104,798 

67,279 

6,740 


Table 3.10: Three dimensions trial 2: summary of the interpretation tree. 

3.8 Summary and Discussion 

This chapter described a constraint-based method for localizing objects grasped by a 
hand. The class of recognition problems discussed are important for utilizing robotic 
hands in manipulation tasks. In general, the experiments and simulations confirm that 
it is usually possible to unambiguously identify the pose of an object grasped by a 
hand using just joint angle and torque data. An unambiguous recognition is most 
likely when the object has many vertices that are different distances apart from each 
other. For objects with a great degree of symmetry, like a square, it is impossible to 
distinguish certain orientations. In the case of the square, of course, there is nothing 
that distinguishes 90 degree rotations, so any localization scheme would suffer from the 
same failing. 

The results obtained are very promising. Data gathered from actual objects grasped 
by the Utah-MIT hand indicate that pose determination and small set object recognition 
are feasible using joint sensor values obtained from just two fingers. The experiments 
utilized only 6 of 16 joint angle readings available from the hand. The performance of 
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the recognizer on this data was limited by the lack of a good model of the hand, and 
apparently not by the limited data. The surfaces of the hand are all different sizes and 
shapes, necessitating a complex model. At the current time we do not have such a model 
available. 

Simulations of the tree pruning portion of the algorithm in three dimensions were 
presented. The results again indicate that hand shape and the grasp acquisition strategy 
have considerable recognition power. The combinatorics in three dimensions, however, 
are of concern. The added pruning power that reducing the finger link length has on 
the problem indicates that addition contact location information, perhaps from tactile 
sensors or additional finger links, may be necessary to make the method’s performance 
acceptable. 






Memory-Based 
Pose Recognition 

Chapter 4 


4.1 Introduction 

This chapter presents a pose determination strategy based on a table-lookup operation 
from a memory. The memory is filled with hand shapes that result from grasping 
particular objects. Estimates of an object pose are obtained by matching a hand’s 
shape to the experiences that have been stored in the memory. Because the lookup 
operation is fast, determination of poses that have been encountered before is fast. 

The determination method uses what is called the grasp acquisition strategy con¬ 
straint. This constraint is simply the information inherent in the knowledge of how the 
hand was programmed to acquire objects. The pose determination memory is filled using 
this constraint. An interesting result from this chapter is that by using the grasp acqui¬ 
sition strategy as a constraint, the memory size is reduced and the number of ambiguous 
poses in each memory entry is reduced. Intuitively, a particular grasp acquisition strat¬ 
egy limits the hand shapes that can occur, which limits the number of ways an object 
can be grasped. This helps makes the use of a memory feasible. 
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Some compromises are made. The use of memory essentially trades off time for 
space. Because configuration space is tessellated, pose accuracy is proportional to the 
tessellation granularity. Techniques for improving the accuracy by using interpolation 
are discussed. Additional sensor information can also be used to improve pose estimates. 
Chapter 5 discusses such a technique, based on data obtained from fingertip force sensors. 

The term hand shape is used throughout this chapter. A hand’s shape is obtained 
from its kinematic model, and from its joint positions. In this chapter, the term joint 
angles is used interchangeably with the term hand shape. 

4-1.1 Relevance of this Problem 

The previous chapter developed a constraint-based method for obtaining object pose 
estimates. While the method avoids the potential combinatorial explosion of searching 
the entire pose interpretation space, it is not fast enough to qualify as being real-time. 
This chapter examines the use of a memory to speed the determination process. Faster 
pose determination is desirable in all cases. When a real-time pose determination is 
required, the faster the method, the better. 

4-1.2 Why Use Grasp Acquisition Strategy Constraints? 

The power behind this method lies in the observation that while, from a mechanical 
standpoint, a hand has a large configuration space, its high level control and planning 
strategies usually limit the space’s size. Consider the configuration space for the Utah- 
MIT hand. The hand has four fingers, each with four degrees of freedom. Thus, the 
space of possible hand positions has 16 dimensions. The less complex Salisbury hand, 
with three fingers, each with three degrees of freedom, has a smaller space of possible 
grasps, though the space is still huge. A grasp acquisition strategy provides a way to 
limit the number of shapes of the hand that can occur. 

Whenever a strategy is used to limit the number of potential hand shapes, it is 
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possible that useful shapes will be omitted. A useful hand shape is one that is required 
for grasping an object in a particular configuration. Thus, while a grasp acquisition 
strategy is useful for reducing the combinatorics of the hand configurations, it may 
provide sub-optimal grasping performance. One way to overcome this is by using a set 
of grasping strategies, each designed for a particular situation. 

The observation that a grasp acquisition strategy greatly limits possible grasps can 
be exploited for determination. For each particular position of an object in a hand’s 
workspace, a simulator for the grasp acquisition strategy can be run to determine the 
grasp that will result if the strategy worked. The hand shapes found in this process 
are the entire set of shapes that occur when grasping the object. As will be seen, the 
determination method described in this chapter is based on this principle. 

4-1.3 Chapter Overview 

The following sections will show how memory-lookup operations can be used for pose 
determination. Section 4.2 provides an overview of previous work. Section 4.3 outlines 
the assumptions required for the solution given. The approach is discussed in Sec¬ 
tion 4.4. Section 4.5 discusses the experimental setup and results. Section 4.6 presents 
conclusions. 


4.2 Related Work 

The work in this chapter is related to memory-based schemes for learning and modeling. 
Two approaches can be considered, those that store experiences directly, and those that 
represent experiences by a set of parameters. Atkeson and Reinkensmeyer [4] provide a 
good recent review of this field. 

When experiences are stored directly, nearest neighbor approaches are used to find 
similar experiences to a new event. Interpolation from limited examples is also possible. 
Methods for this are reviewed by Barnhill [6] and Sabin [89]. Local models can be 
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formed between the nearest neighbors for each access into the memory. See Watson [107], 
Cover [24], and Shepard [93] for examples of this approach. McLain [71], Stone [101], 
Franke and Nielson [37] and others use a distance weighted regression to fit polynomial 
surfaces to data. 

Though memory searches can be performed on serial computers, parallel machines 
make the searches much faster. Stanfill and Waltz [99] learn pronunciation by searching 
for related experiences using the massively parallel Connection Machine (Hillis [46]). 

Connectionist networks, or neural networks, provide an alternative way for represent- 
ing past experiences. See Rumelhart and McClelland [87] and Hinton [48] for overviews 
of the field. Essentially, a set of nodes and links are constructed to approximate a 
function that maps inputs to outputs. The use of the network permits both generaliza¬ 
tion over the training experiences, and more compact storage of past experiences in a 
memory. 


4.3 Assumptions 

The following assumptions are made in this chapter: 

• The hand has been modeled. 

• The objects have been modeled. 

• Either a grasp simulator, or an object pose determiner, is available. 

• The grasped object is assumed to be in static equilibrium. 

• Hand joint angle sensor data is available. 


4.4 Approach 

This section describes the memory based pose determination approach in detail. The 
only sensor inputs to the recognizer is the set of hand joint angles. The output from the 
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inputs lookup table outputs 



object model 



Figure 4.1: Overview of memory-based pose determination. 

recognizer is a coarse estimate of the object’s position in the hand. Figure 4.1 diagrams 
the method, showing the flow of information from inputs to outputs. 

4-4-1 Overvieiv 

There are a variety of approaches for organizing, filling, and using the determination 
memory. The matrix in Figure 4.2 outlines the possibilities that are considered. The 
memory can be organized around a tessellated object configuration space or around a 
tessellated hand configuration space. The memory can be filled by pre-computation, 
prior to determination, or on-demand, during determination. The memory can be used 
by extracting the best pose match, or by interpolating between a set of close pose 
matches. Note that one of the approaches, on-demand tessellated object space, is not 
possible. This will be explained later in this chapter. 

More formally, a determination memory is used to approximate a function R that 
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tessellate object 
position space 

tessellate hand 
configuration space 


pre- 

computation 



m with interpolation » 

without interpolation 



■ 

c(hand) is small 
or 

good GAS 


c(object) is small 
or 

good OPD 






on- 

demand 


c(obicct) > c(hand) | 
and 

with interpolation 


c(object) is small 
or 

poor OPD 


without interpolation 


c(hand) configurations of the hand 
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Figure 4.2: Possible memory organizations. A variety of approaches for organizing, filling, 
and using determination memories are possible. This table illustrates eight potential methods. 

maps hand configurations (shapes) to pose entries: 

R(C H ) = {P 1 ,P 2 ,...P n }, (4.1) 

where Ch is the configuration of hand II, and Pi is a pose. It is possible for R(Ch) — 0. 
A pose entry P{ contains: 

1. A pointer to the object model. 

2. A transform from the object model O to an instance of the object 0(x ) at position 
x. 

3. The hand configuration, Ch- 

The next sections will examine how this function is approximated using a memory. 
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4-4-2 Organizing the Memory 

The determination memory can be organized around a tessellated object configuration 
space or a tessellated hand configuration space. For tessellated object configuration 
organizations, memory locations are computed by finding the hand shape that result 
from grasping the object at a particular location. For tessellated hand configuration 
organizations, memory locations are computed by finding the object poses that result 
from a particular hand shape. 

The choice of organization to use is in part a function of the relative size of the 
tessellated spaces. For two-dimensional problems, a tessellated object space has three 
dimensions. For three-dimensional problems, the space has six dimensions. For tessel¬ 
lated hand configurations, the dimensionality of the space is equal to the number of 
degrees of freedom of the hand. 

Potentially, the space of possible hand configurations will be much larger than the 
space of object positions. However, the grasp acquisition strategy constraint can dra¬ 
matically reduce the effective size of the hand configuration space. Potentially, it can be 
reduced to one degree of freedom, even for problems in three dimensions. A hand being 
used like a parallel jaw gripper is an example of this. 

Tessellated Object Space 

A grasp simulator is used to compute memory locations for a tessellated object space. 
The grasp simulator G, when given an instance of an object 0, finds the hand configu¬ 
ration Ch that would result from the grasp: 

G(H,S,0(x)) = C h , (4-2) 

where O(x) is an instance of object 0 at location x , and S is the grasping strategy being 
used. 
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Tessellated Hand Space 

An object pose determiner is used to compute memory locations for a tessellated hand 
space. When given a hand configuration Cjj, a pose determiner finds the set of object 
poses that are consistent with the hand shape. This can be thought of as the inverse of 
Equation 4.2: 

G-\C„) = {0{x l ),0{x 2 ),...0(x n )}. (4.3) 

The constraint-based pose determination approach described in Chapter 3 provides 
an algorithm for computing Equation 4.3. The method finds all object poses that are 
consistent with a particular hand shape, assuming certain types of contacts and ter¬ 
mination predicates. Unlike with grasp simulators, object pose determiners make fewer 
assumptions about the world. For example no assumptions need be made about whether 
the object was stationary during the grasping operation. This advantage makes it easier 
to fill the memory accurately. 

Simulation Issues 

What are the requirements for a good grasp simulator? Most importantly, the simulated 
gripping process should correspond to how the hand would behave when actually grasp¬ 
ing the object. If the simulator’s output differs from reality significantly, the resulting 
memory entry would be meaningless. One of the most difficult factors to consider is mo¬ 
tion of the object during the grasping process. It is hard for a simulator to predict such 
motion correctly. As will be seen, an advantage of tessellating the hand configuration 
space is that it does not require a grasp acquisition simulator. 

It is interesting to note the relationship between this type of grasp simulator and 
a grasp planner. Grasp planners, like grasp simulators, must also consider the factors 
listed above. The actual grasp and finger motions that a planner generates are assumed 
to be executable by the hand. Thus, the planner must have some of the same knowledge 
that a simulator contains, in order to plan realizable motions. 
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Figure 4.3: Marginal grasp. A small change in the object’s position will cause a large change 
in the hand’s shape. Likewise, grasps are possible where a small change in the hand’s shape 
will cause a large change in the object’s pose. 

The grasp acquisition simulator, when applied to an object in a particular pose, must 
find the resulting hand shape. If the object is out of reach, no grasp will be found. For 
an object in reach, the resulting grasp can be categorized as follows: 

1. A small change in the object's position will cause a small change in the hand’s 
shape, and vice versa. 

2. A small change in the object’s position will cause a large change in the hand’s 
shape. 

3. A small change in the hand’s shape will cause a large change in the object’s posi¬ 
tion. 

In the first case, the pose is considered to be a good one, and the hand shape is a good 
indication of the object’s pose. For the second two cases, the mapping between hand 
shape and object pose is not as well defined (see Figure 4.3). Two problems occur with 
these marginal grasps. Coarse tessellations of a space could miss particular configura¬ 
tions, and sensitivity may be poor in the regions around the marginal configurations, 
depending on the space that was tessellated. 

To examine the issue of marginal grasps in more detail, refer to Figure 4.4. Each axis 
represents the level of tessellation of the labeled space. The plots in the figure can be used 
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Figure 4.4: Memory sensitivity. When mapping hand shapes to object poses, three classes of 
conditioning can be considered. Poor conditioning occurs in the third case, when a small range 
of hand shapes maps to a large range of object positions. 

to understand when the problem is poorly conditioned. In the first case, the conditioning 
is good, independent of which space is tessellated. In the second case, the conditioning 
is excellent when hand space is tessellated. If the object space is tessellated, a sparse 
sample of hand shapes would result. In the third case, conditioning is unavoidably poor, 
independent of tessellation considerations. Since a wide range of object positions map 
to a small range of hand shapes, using joint position sensing for determination cannot 
work well. Thus, the only case that is of concern is the second case. By tessellating the 
hand space this conditioning problem is avoided. 


4-4-3 Filling the Memory 

Both pre-computation and on-demand approaches are considered for deciding when to 
perform the computations required for filling the memory. If the tessellated space is 
small enough, it is feasible to pre-compute the memory. This is desirable, as on-line 
determination becomes a fast, constant-time lookup. If the space is too large, this pre- 
computation becomes impractical. As an alternative, portions of the memory that are 
encountered can be computed on-demand. This scheme reduces performance when a 
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new hand shape is encountered, though the results are then committed to the memory, 
which speeds future determinations. 

A benefit of the on-demand computation is that only the portions of the space that 
actually occur are computed. This reduces the size of the memory. The use of an on- 
demand approach can also reduce dependence on the grasp acquisition strategy. If the 
memory is organized around hand configurations, pre-computation requires knowledge 
of the hand shapes that are possible, as provided by the grasp acquisition strategy. By 
performing on-demand computations, this knowledge is no longer needed. 

In certain situations, the configurations that an object can assume are limited. The 
pre-computation approach is especially desirable in this case. For example, in a factory 
environment a robot may be acquiring a part from an assembly line. The part might 
have a nominal position, though vibrations and other sources of motion may introduce 
a certain amount of placement error. 

Note that on-demand tessellated object position space is not a possible approach for 
pose determination. If a new hand shape is encountered, the on-demand approach would 
require simulating all possible object positions to find the ones that are compatible with 
the hand’s shape. This, in essence, would be a pre-computation of the entire space. 

4-4-4 Using the Memory 

From an implementation standpoint, various approaches can be used for approximating 
the mapping function R, including hash tables, content addressable memories, and neural 
networks. An approach based on hash tables is used for the experiments conducted in 
this chapter. 

Hand configurations are entered into the determination memory using an index, or 
key. The index is generated from the hand’s finger joint angle array. A particular bucket 
size is used to cluster adjacent joint angles together. The bucket size is selected based 
on several considerations, including the actual accuracy expected from the joint angle 
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Figure 4.5: Determination memory entries. The contents of four different determination 
memory locations are shown here. The upper locations each contain one object pose, and thus 
uniquely identify the object’s location. The lower locations have multiple entries, and thus 
cannot uniquely identify where the object is position. 

sensors on the hand. Multiple scale indices can be generated, using different bucket 
sizes. 

Figure 4.5 shows the contents of four entries in a typical determination memory. The 
hand drawn in dark lines is the shape of the entry key. The hands drawn in lighter lines 
correspond to the objects in the particular entry. The memory entries represented by 
the upper figures contain just one object pose, and thus uniquely identify the object’s 
pose. The memory entries represented by the lower figures contain multiple entries, and 
thus do not uniquely identify where the object is positioned. 

Thus, a determination memory provides a direct mapping from a set of hand joint 
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angles to objects and their positions. Determination is achieved simply by generating a 
key from a hand shape and hashing into the memory. The entry will contain the one or 
more poses of the object that are consistent with the hand shape. 

4-4-5 Memory Interpolation 

Determination memories, as previously described, suffer from two related problems. The 
first problem concerns the lack of a limit on memory size. Finer tessellations result in 
more memory usage. While it has been argued that useful tessellations have a manage¬ 
able memory size for certain classes of problems, more efficient use of memory is always 
desirable. The second problem concerns close, but imperfect, matches. Essentially, a 
hand shape is indexed into the memory, and close matches are extracted. The pose that 
corresponds to the best match is selected as the candidate. It is possible to better esti¬ 
mate the pose by interpolating between related memory entries. This section explores 
both these ideas. 

Object grasps are considered to be in related configurations if the object vertex to 
finger segment assignments are the same. In these related configurations, continuous 
changes in the object position result in continuous changes in the hand shape. In fact, 
a particular function exists that maps hand shapes to object poses, for each vertex- 
segment configuration. Obtaining this function, and a closed form solution for it, is 
hard. It is also dependent on the particular hand’s kinematics. Because of this, a 
general second order function is assumed, and regression analysis is used to identify the 
function’s coefficients. As an example, consider a two jointed, two fingered hand. There 
exists a set of functions of the form: 

/«(*!, *2, ^3? ^ 4 ) — a l T «2^1 T a 3@2 T a 4@3 + <*5^4 + a 6@i T a 7^2 "4 ° 8$3 T a 9@4 T 

a 10@l@2 + a ll$l$3 -f <* 12 ^ 1^4 + a 13$2$3 + «14$2^4 4" <*15^3^4 
fyc(& l,02i @3-, ^ 4 ) = b\ + 62^1 + ^ 3^2 + ^ 4$3 + & 5^4 + b&9\ + by$\ + + 69^4 T 

blo0l&2 T ^11^1^3 T bi20\$4 + 6i3#2$3 + &14$2$4 + ^15^3^4 (4.4) 
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f$c{Q@3i 0 4) — C 1 + C2$l + C$02 + 0463 + 0564 + C§6\ + C^0\ + C%0\ + Cq0\ + 

c 10$l$2 + c ll0\03 + C\20\04 + c 13^2^3 + c 14^2$4 + c 15^3^4 

where c is the configuration number, 0, are the hand’s joint angles, and {a;,6 t -,c;} are 
the coefficients to be identified. In general, there will be 3(n 2 — 1) coefficients, where 
n is the number of joints in the hand. Note that there is one function for each of the 
unknown parameters required to position the object. 

Equation 4.4 can be used to both solve the problem of large memory size and to 
allow interpolation between existing memory entries. All entries in a configuration can 
be summarized by the a coefficients from the equation, eliminating the need for storing 
each joint angle to object pose mapping entry separately. This greatly reduces the size 
of the memory, making it proportional to the number of contact configurations, rather 
than the size of the tessellated space. The equation, when given a set of joint angles, 
will return the best estimate of the object’s pose that is consistent with the data used 
to find the a coefficients. This provides the desired interpolation capability. 

Each instance of the set of functions shown in Equation 4.4 is valid for a particular 
set of object vertex to finger segment assignments. For determination, only the hand’s 
joint angles are available. There is no knowledge of the particular contact configuration 
that has occurred. The question then remains as to which particular equation, from the 
set of c equations, should be applied. Put another way, a mapping from joint angles to 
contact configurations is required. 

One approach that can be used is to try all c sets of functions. Each set of equa¬ 
tions will return a particular object position. For all but one of the sets of equations, 
the contact configurations will be incorrect, and hence the computed object position is 
incorrect. To determine which function was the appropriate one to use, the computed 
object position is simply verified against that hand’s shape. If the position gives the 
contacts between the hand and object that were required for the set of functions, then 
the pose is accepted. Otherwise, it is rejected. 
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Joint 1 



Figure 4.6: Contact configuration memory. This figure diagrams the coarse memory that 
maps joint angles to contact configurations, for a hand with two joints. In this example, two 
contact configurations occur. Each entry contains the one or more contact configurations that 
occur for the entry’s range of joint angle values. 


A more efficient approach that can be used is to build a coarse memory that maps 
hand shapes to contact configurations, and hence directly to the determination functions 
(see Figure 4.6). Thus, a memory is built that is indexed by joint angles, and returns sets 
of determination equation coefficients. This memory plays a role similar to the initial 
scheme described, where the memory maps joint angles directly to an object position. 
However, a memory that maps joint angles to contact configuration information can be 
made more coarse, since it is not used to directly obtain the object’s position. Rather, 
it is used to map to interpolation functions, which then compute the position. 
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Figure 4.7: Distribution of pose configurations. The number of poses in each configuration, 
sorted by size, is shown here. 

Thus, the use of this interpolation approach can be summarized as follows: 

1. Sort the memory by configuration. 

2. Perform a regression on each configuration to identify the a coefficients. 

3. Build a coarse memory (one with relatively large buckets) to map hand shapes to 
contact configurations. 

Note that to perform the regression analysis for obtaining the a coefficients, at least 
as many equations as coefficients are required. If a particular configuration lacks the 
required number of entries, additional ones can be obtained by sampling the space more 
finely in the deficient area. 

To demonstrate the interpolation ability of this technique, a number of tests were 
performed on a sample memory generated using objects grasped by a two fingered, two 
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jointed hand. The following paragraphs describe these tests and present their findings. 

First, an examination of the number and .distribution of configurations found in a 
memory was performed. Figure 4.7 shows the distribution for an object with three 
vertices. A uniform tessellation of the object position space was performed, with 8000 
total positions sampled. Note that the first 15 or so configurations account for the bulk 
of the sampled grasps. Several of the configurations lack the required minimum number 
of poses for identifying the regression coefficients. Finer tessellation in those areas would 
be required. 

For each of the configurations that had at least the minimal number of required poses, 
a regression was performed. The Levenberg-Marquardt algorithm, as implemented by 
the Minpack [73] library, was used. The algorithm obtained the set of coefficients that 
accurately recovered the joint angle to pose mapping for the data that was used in the 
regression. In all cases, a stable set of coefficients that satisfied the minimization could be 
found. The coefficients, if based on a sufficient sampling of the configuration space, were 
found to allow a reasonably accurate interpolation between entries. The experiments 
suggest that the number of pose samples required for obtaining a good estimate of the 
coefficients varied widely. As one would expect, for hand shapes that covered a large 
range of positions, a relatively large number of pose samples were required. For smaller 
ranges of hand shapes, as few as 25 samples were required for obtaining reasonable 
position estimates. This supports the claim that a second order function is appropriate 
for mapping joint angles to an object pose. 

The following procedure was used to examine the interpolation capabilities in more 
detail. The memory entries for an exemplar contact configuration were randomly or¬ 
dered. The first n of these entries, where n is varied from the minimum required samples 
for interpolation, to the maximum number of entries in the configuration, were used in 
the regression. The remaining entries from the configuration, the ones not used in the 
regression, were then predicted using the computed coefficients. The error between the 
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Table 4.1: Interpolated poses. This table shows the desired position parameter, the actual 
parameter computed by the estimation function, and the percent error of that value. 

predicted and actual values were then found. A percent error was also computed, based 
on the total range in values for that parameter. 

Table 4.1 presents a number of pose location parameters that were computed using 
the estimation function. Notice that in most cases, the percent error in position is small, 
usually only a few percent. The error in 6 is slightly larger in general, though this is 
probably because of the larger tessellation that was performed in that dimension. 

Two configurations were compared, one that covered a larger range of object positions 
than the other. The first had 107 entries, and covered the smaller range. Figure 4.8 
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Figure 4.8: Typical interpolation error. The error in interpolation, for object x, y, and 9, is 
plotted against the number of configurations used in the regression. This plot is representative 
of the performance of the method under most tested configurations. 

plots the error in x, y, and 9 for a randomly selected position, as additional poses are 
used in the regression. The performance here is quite good. Figure 4.9 plots the same 
parameters for a larger configuration, one with 252 entries. In this case, performance is 
somewhat degraded. 

The stability of the estimation function’s coefficients was examined. Figure 4.10 
shows a plot of three coefficients against the number of samples used in the regression. 
Notice that as the number of samples increases, the coefficients approach a constant 
value, as would be expected. For the trial plotted in the figure, approximately 125 poses 
were required before a stable set of coefficients were found. This particular configuration 
covered a rather wide range of hand shapes, and needed a large number of poses to obtain 
a stable set of coefficients. For configurations that occurred over a smaller range of hand 
shapes, fewer coefficients were usually required. For some cases, as few as 25 poses 
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Figure 4.9: Large interpolation error. The error in interpolation, for object x, y, and 0, 
is plotted against the number of configurations used in the regression. Initially, the error in 
predicted position parameters is larger, due to the larger range of positions that are in this 
configuration. 

produced reasonable results. 


4-4-h Execution Algorithms 

In the next sections, two of the potential memory determination strategies are examined 
in more depth. The first method considered is for tessellated object space with pre- 
computation of the memory. The second considered is for tessellated hand space with 
on-demand computation of the memory. 

Pre-Computed Tessellated Object Space 

Pre-computed tessellated object space is appropriate when the number of object config¬ 
urations is smaller than the number of hand configurations, and either the number of 
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Figure 4.10: Estimation function coefficients. Three of the function’s coefficients are plotted 
against the number of samples used in the regression. 

hand configurations is small, or a good grasp simulator exists. The algorithm has two 
stages, the pre-computation stage to fill the memory, and the run-time determination 
stage. 

The determination algorithm, using interpolation, proceeds as follows (see also Fig¬ 
ure 4.11): 

1. Pre-computation stage: 

(a) Build the determination memory by sampling object configurations and sim¬ 
ulating grasps. 

2. Determination stage: 

(a) Obtain the determination memory key from the set of hand joint angles. 

(b) Index into the determination memory using the key. 



106 


Chapter 4 Memory-Based Pose Recognition 




Figure 4.11: Pre-computed tessellated object space. This figure shows a flowchart for the 
determination algorithm. 

(c) Order candidates from memory entry based on the closest matches to the 
actual joint angles. Interpolate object pose. 

If interpolation is not to be used, only the best match to the key is extracted, and is 
used for the pose estimate. 

On-Demand Tessellated Hand Space 


On-demand tessellated hand space is appropriate when the number of hand configu¬ 
rations is smaller than the number of object configurations, and either the number of 
object configurations is small, or a good pose finder exists. The algorithm has just one 
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Figure 4.12: On-demand, tessellated hand space. This figure shows a flowchart for the 
determination algorithm. 

stage, the run-time determination stage. 

The determination algorithm, using interpolation, proceeds as follows (see also Fig¬ 
ure 4.12): 

1. Determination stage: 

(a) Obtain the determination memory key from the hand’s joint angles. 

(b) Index into the determination memory using the key. 

(c) If there are no matches, find poses for the hand shape. Add poses to the 
memory. 

(d) Order candidates from memory entry based on the closest matches to the 
actual joint angles. Interpolate object pose. 
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Figure 4.13: Photograph of the planar hand. 

If interpolation is not to be used, only the best match to the key is extracted, and is 
used for the pose estimate. 


4.5 Experiments 

To test the ideas presented in this chapter, both simulations and trials using an actual 
robotic hand were performed. The simulations examined the characteristics of the de¬ 
termination memories as certain parameters were varied. Experiments on a two-linked, 
two-fingered planar hand were also performed, which is described in Appendix A. A pho¬ 
tograph of the hand is shown in Figure 4.13. Unless otherwise noted, the experiments 
were conducted using this planar hand with planar objects. 

For simplicity, the experiments conducted in this chapter were done using a move- 
until-contact acquisition strategy. Here, the hand’s joints are rolled forward, closing onto 
an object, until a joint torque limit is exceeded. The joints are closed from proximal to 
distal. When the most proximal joint can no longer move, the next joint in sequence is 
then servoed. 
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Figure 4.14: Experimental objects. 

4-5.1 Pose Determination 

For these experiments, a library of two objects was used, as shown in Figure 4.14. The 
pre-computation tessellated object space method was used. By building the memory 
with two objects, these tests perform both small set recognition and pose determination. 
For the recognition experiments performed here, different bucket sizes where used, where 
a determination memory is generated at each bucket size. The memory lookup operation 
is performed first with the smallest bucket size. If no matches are found, the memory 
with the next largest sized bucket is then consulted. The process repeats until matches 
have been found. 

Figure 4.15 examines the number of buckets in each memory. The z-axis indicates 
the relative bucket size. The y-axis indicates the total number of memory entries that 
were found in that bucket. In the following results, the memories from only the first two 
buckets are consulted. The larger bucket sizes proved to be unnecessary as they grouped 
too many entries together. 

Trial 1: For this hand shape, there were essentially two poses found, as the two right¬ 
most entries are slight variations of the same pose, as shown in Figure 4.16. In this 
figure, and the subsequent ones, the actual hand shape is drawn in a dark line, while 
the hand shape from the bucket key is drawn in a light line. The actual object grasped 
is shown in the right poses. Note that they form a closer fit to the actual hand shape. 


number of buckets with entries 
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Figure 4.15: Buckets per determination memory. 





Figure 4.16: Trial 1: three memory entries. 
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All these poses were found from the memory with the finest bucket size. 

Trial 2: In this trial, shown in Figure 4.17, the upper two poses were found from the 
most fine bucket size. The left most of these corresponds to the actual pose of the 
object. The right pose is from the same family of poses, as well. Note that no poses 
from the other object in the library were consistent with this hand shape, as is desired. 
For reference, the lower nine poses are from the memory with the next larger scale. Note 
that there still are no poses from the other library object in this set. 

Trial 3: For the trial shown in Figure 4.18, only one pose was found. This pose corre¬ 
sponds to the grasped object’s position. 

Trial 4 • For the trial shown in Figure 4.19, far more poses were found in the memory 
entry. The poses shown in the top row of the figure corresponds to the object’s position. 
Note that a large number of poses for the other library object were also found, in two 
totally distinct configurations. All these poses are from the most fine memory. 

Trial 5: For the final trial, shown in Figure 4.20, the top most pose was the only pose 
that corresponded to the hand shape data, from the most fine determination memory. 
It correctly corresponds with the pose of the grasped object. The remaining poses were 
found from the next larger scaled memory entry. 

4-5.2 Haptic Information Content of Hand Shape 

As more links are added to a hand, its haptic information content should increase. In 
the limiting case, an infinite jointed hand would totally conform to the shape of the 
object that it was enclosing, as if it were a rope tied around the object. 

One way to investigate the haptic information contained in hand shape is to examine 
determination memories. An experiment was conducted for this purpose. For a fixed 




Figure 4.17: Trial 2: eleven memory entries. 
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Figure 4.18: Trial 3: one memory entry. 

object, the number of links on a simulated two fingered hand was varied. The sum of the 
link lengths was kept constant. For each hand, determination memories were built by 
tessellating the object configuration space. The recognition power of a hand is defined 
to be inversely proportional to the ambiguities in the determination memory. An entry 
is defined to be ambiguous if there are more than a certain low number of poses in an 
entry bucket. 

Figure 4.21 presents the results of this experiment using a bar chart. The x-axis 
indicates the number of entries in a bucket for that bucket to be considered unambiguous. 
The y-axis indicates the percent of entries in the entire memory that are unambiguous. 
Each bar is divided into four categories, which indicate the number of links in each 
finger. From left to right, two, three, four, and five link fingers are considered. From 
the plot it can be seen that even in the worst case, with a two link hand, 68 percent 
of all memory entries are unambiguous. Adding three links dramatically increases the 
performance making 84 percent of the entries unique. If up to three entries in a bucket 
are considered unique, 94 percent of the memory entries are acceptable for a four linked 
finger. 

There is an interaction between the number of finger links, the bucket size used to 
group entries, and the percent of memory entries that are unique. As the entry key’s 
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Figure 4.19: Trial 4'- ten memory entries. 
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Figure 4.20: Trial 5: nine memory entries. 
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Figure 4.21: Number of links vs. memory ambiguity. Each cluster contains four bars, one 
for a two, three, four and five link finger. Each cluster denotes the maximum number of entries 
in a bucket that is considered to be unique, ranging from one to three. The y-axis of the plot 
indicates the percent of all entries that are unique. 

bucket size is increased, more entries will be assigned to the same bucket, as there are 
fewer buckets. The plot in Figure 4.22 shows this effect. The x-axis indicates the relative 
bucket size. The larger x values indicate larger buckets. The y-axis indicates the percent 
of memory entries that are unique. The results for two, three, and four link fingers are 
plotted. Note that as the bucket size increases, fewer entries are unique. Adding more 
links (which also increase the number of buckets) increases the percent of unique entries 
for all bucket sizes. It is interesting to note that the reduction in ambiguity as finger 
links are added is not as dramatic as might be thought. 

To consider how sensing might help improve the performance of this determination 
approach, the following experiment was performed. A memory was generated for a test 
object. The memory was sorted by number of poses in a bucket, where a bucket holds a 
set of similar hand shapes. In addition, another memory was built using the same hand 
and object, but with the additional information that would be provided by sensors that 
indicate if contact with a link has been made. The sensor information provides a way 
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Figure 4.22: Effect of hand fingers on memory ambiguity. This plot shows that larger number 
of finger links increases the recognition power of a hand, but not as dramatically as might be 
expected. 

to disambiguate among multiple poses that map to the same hand shape. If the poses 
result in a different set of link contacts, they can be distinguished. 

In Figure 4.23, the solid line plots the total number of poses that are found in each 
bucket size. Thus, an xy point indicates the total number of poses (the y value) that 
are in buckets that have x poses in them. Any bucket that has more than one pose in it 
is considered ambiguous. The dotted line plots the same numbers for a hand equipped 
with sensors. To eliminate the effect of related families of similar poses, a family of poses 
was considered as a single pose when computing these results. 

In the plots, notice that the number of poses in the larger buckets drops off signifi¬ 
cantly when using the contact sensing information. All buckets now have just one or two 
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Figure 4.23: Effect of sensing on memory ambiguity. The numbers plotted with a solid line 
are for a hand without sensing. The dotted line is for a hand with sensing. 

poses in them. Notice as well that by adding more links to the hand an improvement 
also results, though not as large. This result suggests that adding contact sensors to a 
hand improves its recognition power more than would adding a small number of extra 
finger links. 


4.6 Summary and Discussion 

This chapter described a memory-based method for localizing objects grasped by a 
hand. Various approaches for organizing, filling, and using the determination memory 
were explored. Experiments were performed using both simulations and a planar hand 
with two fingers, each with two links. 
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The results show that memory-based determination can be used to localize an ob¬ 
ject’s pose. The use of the grasp acquisition strategy constraint makes the memory size 
manageable. Memory can be further compacted by using regression-based curve fitting. 
Regression also provides a way to interpolate between entries to obtain a more precise 
pose estimate. 

For problems in full three dimensions, memories are still useful, though certain factors 
must be considered. Tessellations of the object space may be practical only under certain 
restricted conditions. For the full three dimensions, six parameters must be varied. For 
all but very coarse spacing, it would be too time consuming to perform such a tessellation. 
However, if objects are resting on a stable face on a table, and if the hand approaches 
the object from a fixed direction, such tessellations may be possible. In this case, n 
tessellations in two dimensions would be required, where n is the number of object 
faces. For tessellated hand configurations, three dimensions can be handled as long as 
the grasp acquisition strategy restricts the dimensions of the hand space adequately. 
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5.1 Introduction 


When manipulating a grasped object, especially with a robot hand, it is helpful to have 
an estimate of the object’s orientation within the grasp. The object’s orientation can 
be extracted from knowledge of the surface normals at the various points of contact 
with the object, but these surface normals must first be transformed into a common 
coordinate system. Successful execution of these transformations requires a prohibitive 
amount of accuracy in calibration of the arm and hand. This chapter presents an on¬ 
line method to improve these transformations in spite of calibration errors. The method 
requires collecting contact force readings as the object is manipulated, and computing 
transform corrections that minimize the variation in the sum of the contact forces. Both 
experimental and simulated results are presented, and the implications of the results are 
discussed. 


121 


122 


Chapter 5 Sensor-Based Pose Refinement 


5.1.1 Relevance of this Problem 

When an object is grasped by a robot hand, the eventual position and orientation, or 
pose, of the object is difficult to accurately predict. When performing even the most 
rudimentary manipulation of the object (for example, placing it somewhere else), it 
is helpful to have some additional knowledge of the object’s pose relative to a fixed 
coordinate frame. 

Previous chapters of this report developed techniques for coarse pose estimation 
relative to the reference frame of the hand. In some cases, more precise estimates are 
required. In addition, since the previous techniques used just hand shape and grasping 
strategy information, solution ambiguities could occur. Additional sensor information 
can be used to overcome these problems. 

Sensor-based refinement techniques must combine local sensor readings into a global 
pose estimate. This requires accurate transforms from local coordinate frames to a global 
coordinate frame. This chapter shows that the calibration and sensing requirements for 
obtaining accurate global data are severe. In fact, simple operations such as adding the 
contact forces together to find the object’s weight gave meaningless results. Techniques 
for self-calibration are presented in this chapter, as they can reduce this problem. 

5.1.2 Source of Information 

There are various sources of information available for finding the pose of an object 
relative to a fixed world coordinate frame. One source of global information is the set 
of joint position sensors. The locations of the intended contact points can be calculated 
using the robot’s forward kinematics. For fingertip grasps, for example, the contact 
points will be at the locations of the fingertips themselves. This information aids in 
determining the pose of the object, but only within the limits of calibration and model 
errors. 

Another source of information is force and surface normal measurements at the points 
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Figure 5.1: Overview of pose refinement. 

of contact. Fingertip sensors, such as those designed by Brock and Chiu [15], can provide 
this type of information. Surface normals, in particular, can be fit to an object model 
to solve for the object’s orientation. Since the surface normal measurements are local 
measurements, however, they must be transformed into a fixed frame before they can 
be of any use. One way to do this is simply to use the robot’s inverse kinematics. Of 
course, this is susceptible to the same calibration and model errors as the first method. 

In principle, the object’s orientation estimate can be improved by assuming that there 
are calibration errors, and by attempting to solve for corrections to the transforms from 
the contact points to a fixed coordinate frame. This can be done by taking a number of 
force readings at different object orientations and finding the correction transforms that 
best fit the requirement that the sum of the contact forces must equal the object weight 
at every object orientation. Here, contact forces are used as an independent information 
source to augment the information we have on the configuration of the robot. Figure 5.1 
diagrams the process used to refine an object’s pose. 

Calibration problems are compounded by the object grasping forces. When the 
fingers constrain an object, they exert an internal force. This force can be quite large, 
depending on the grip’s firmness. The signal being extracted from the fingertip sensors 
is the total external force applied to the object due to gravity. The noise that must be 





124 


Chapter 5 Sensor-Based Pose Refinement 


overcome is the internal grasping force. It is typical for an object to be grasped with 
an internal force several times that of its weight. This problem can only be solved by 
having a good calibration. 

From another standpoint, this work provides a method for continuously calibrating 
the fingertip sensors. As an object grasped by a hand is moved by an arm, continuous 
sensor readings can be obtained and used to refine the system’s estimate of the object and 
fingertip orientations. The relatively small amount of computation required is suitable 
for real-time applications. 


5.1.3 Using World Invariants 

The first part of the pose refinement method uses the constraint that the sum of the in¬ 
ternal grasping forces must equal the object weight, or simply a constant vector oriented 
in the direction of gravity. Sensor readings are taken in the fingertip or sensor frame. 
Errors are introduced when these readings are transformed to the common world frame. 
The method then attempts to find a correction to each fingertip’s orientation to make 
the constraint on the sum of the forces hold true, after the tip forces are transformed to 
a common coordinate frame. 


5.1.4 Chapter Overview 

The following sections will show how contact force and surface normal information can 
aid in determining the orientation of a grasped object. In particular, multiple samples 
of contact force information are used to help correct for robot calibration and model 
errors. Section 5.2 provides an overview of previous work. Section 5.3 outlines the 
assumptions required for the solution given. The approach is discussed in Section 5.4. 
Section 5.4.1 covers the process of finding an object’s orientation with perfect kinematics 
and sensors, using a set of three linearly independent surface normal measurements. 
Section 5.4.2 shows how this estimate can be improved by taking force readings with 
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the object in various orientations and grounding the resultant of the forces to the object 
weight. Section 5.4.3 shows an alternative approach for refinement based on a small- 
angle approximation. Section 5.5 discusses the experimental setup and their results. 
Section 5.6 develops a simulator used for predicting fingertip sensor readings. The 
simulations are used to examine the performance of the refinement algorithms, and 
to better understand the observed experimental results. Finally, Section 5.7 presents 
conclusions. 


5.2 Related Work 

Related work falls into the categories of haptic object recognition and kinematic cali¬ 
bration. Haptic object recognition is the problem of using the sensors on a robot hand 
to recognize an object. This problem is often examined from a model-based feature 
matching framework, such as in Gaston and Lozano-Perez [38] and Ellis [31]. Object 
properties such as face normal directions and contact point distances are measured and 
matched against a model. In contrast, Lederman and Klatzky [67] examine how hand 
motions can be used to recognize objects. Allen and Bajcsy [3] and Allen [2] propose a 
method for building surface maps using tactile sensors and vision. The surface features 
can be matched against an object database to identify the object and its pose. These 
works make no particular mention of the issue of calibration of the hand’s kinematics. 

Hollerbach [50] provides a recent review of the field of kinematic calibration. Cal¬ 
ibration of hands has been studied as a problem of closed-linked kinematic chains by 
Everett and Lin [32], and Bennett and Hollerbach [10, 9]. 

This chapter contributes to the work mentioned above by presenting an active cali¬ 
bration process that can be used while the hand is performing its normal manipulations. 
Multiple sensor readings obtained in the course of the motion are used to refine the 
sensor orientation estimates. This is made possible because we only need to correct 
fingertip orientation in order to extract object orientation. Locally correct estimates of 
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object orientation are acceptable, as long as they are kept up to date. This means that 
complete calibration of the configuration of the arm and hand is not required. 

5.3 Assumptions 

The following assumptions are made in this chapter: 

• Contact forces and estimated surface normals can be obtained at all points of 
contact with the object. 

• Known changes in object orientation can be generated (this may correspond to 
having accurate arm kinematics and inaccurate hand kinematics). 

• The grasped object is polyhedral, and the faces containing each of the contact 
points are known. 

• The grasped object is assumed to be in static equilibrium. 

• The error in the contact force and surface normal measurements (the sensor read¬ 
ings) is negligible in comparison to the error in the position and orientation of the 
sensor frame. 


5.4 Approach 

The pose refinement process is described in two stages. First, a method is presented for 
refining fingertip contact normals and finding the weight of the grasped object. This 
method relies only on fingertip sensor data. Next, the added constraints provided by the 
object’s model and the fingertip to object face assignments is used to refine the object’s 
orientation. 

Several reference frames that are used in the subsequent sections are first defined: 

U fingertip i frame 

Wi world frame seen by fingertip i 

w world frame 
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The fingertip world frames are obtained from the robot’s forward kinematics and the set 
of joint position sensors. They are inaccurate due to calibration errors. Essentially, the 
method described here will find a correction to take Wi to w for each fingertip. 

5.4-1 Recovering an Object’s Pose 

If the transforms obtained from the inverse kinematics of the robot are accurate, finding 
the object orientation is easy. To perform an unambiguous fit, contact information must 
be available on three surfaces of the object with linearly independent normals. The 
object orientation desired is the best rotation of the modeled object to bring the surface 
normals M •" into alignment with the measured surface normals N™. The measured 
surface normals N™ are transformed from the local sensor frames to the world frame by 

N? = T t '“N t i i . (5.1) 

The object model normals must then be aligned with the set of world coordinate normals, 
Nj°. One error function that could be used to perform the alignment is given by 

£ = £( A0 t ) 2 , (5.2) 

•=i 

where A0,- is the angle between the measured and model normals in the proposed object 
orientation. The object orientation that minimizes the error term is the most consistent 
guess available for the given normal measurements. 

5-4-2 Refining Inaccurate Contact Normals 

In reality, the inverse transforms from the contact sensors local frame to a world frame are 
not particularly accurate. For a setup using a Salisbury [90] hand and a Puma arm (see 
Appendix A), the contact forces obtained are so inaccurate that a simple experiment 
to weigh an object gives meaningless results. Thus, before performing the procedure 
described in the previous section, corrections to the fingertip transforms T™ must be 
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found. This section details a method to find these transforms by combining multiple 
sensor readings using the direction of gravity (or the direction of the object weight) as 
an invariant. 

If the inaccurate world frame for tip i is denoted u>,-, then an accurate surface normal 
N™ is found from 

n? = t;tz-nI\ (5.3) 

where T™. is the correction transform that must be found. The constraint used here is 

W % 

that the sum of the fingertip forces must equal the object weight: 

j^Fr^mg. (5.4) 

t=i 

And of course, the contact forces are transformed to the world frame in the same manner 
as the surface normals: 

Fr = T^'Ft'. (5.5) 

If we put the gravity vector along the z-axis, we get: 

EFT, = 0 

•=1 

EFT, = 0 ( 5 . 6 ) 

1 = 1 

EFT, = mg. 

i =1 

The correction transforms T™. must be such that they satisfy Equations 5.5 and 5.6. If 
the fingertip sensors are sampled in multiple object orientations, an error function can 
be defined as: 

E 2 ,=E{ (FT,,) 2 + {FT,) 2 + (FT, - ms) 2 } • (5-7) 

If p force samples are collected, the values of T™. that best satisfy 

E E 2 , = 0 . 

s = 1 


(5.8) 
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give refined sensor to world transforms that can be used to update the contact force and 
surface normal measurements. 

One added problem that must be considered is the weight of the tip itself, which 
is a significant 0.16N. The force output F will only be accurate when the tip is in its 
calibration orientation. To correct for this, the tip is calibrated with its z-axis parallel to 
the gravity vector in world coordinates. A good assumption is that in this configuration, 
the tip weight acts directly through its origin. Thus, the force that was erroneously 
subtracted out during calibration can be restored by adding the tip weight to the z- 
component of all calibrated sensor force readings. These can then be corrected for the 
effect of the weight in the current configuration by projecting the weight into the current 
tip frame and subtracting this from the result. The accuracy of this correction, of course, 
depends on the accuracy of the transform from the world coordinate system to that of 
the tip. More formally, if F-' r are the raw tip force readings, the corrected readings, Ff', 
can be obtained from 

F? (5.9) 

where W™ is the weight vector of fingertip i, in world coordinates. Thus, Equation 5.5 
can be written in terms of the raw forces as 


F t w 


K T u (Ff - t^w?) 


( 5 . 10 ) 


The error term in Equation 5.7 is computed using the forces obtained from this equation. 

The three correction transforms have a total of nine unknown rotation parameters, 
but the optimization method can only be used to solve for eight of these. The method is 
useful for making the tip world coordinate frames internally consistent, and for aligning 
the z-axes of these frames with the actual world z-axis (or with the direction of gravity), 
but it cannot align the tip world x and y axes with those of the actual world coordinate 
system. While the z-axis is in the direction of gravity, there is no natural phenomenon to 
distinguish x from y. Because of this, a zyz Euler angle representation for the correction 
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rotations is used (Craig [25]), 


cacft 

sacft 

-*P 


casftsj — sacj casftc') + sas~) 
sasfts-y + cac'y sasftc'y — cats') 
efts') cftcy 


(5.11) 


and the last rotation about the 2 -axis in the correction vector of one of the tips is 
arbitrarily set to zero. After a solution is found, a rotation about z is applied to all of 
the tip rotation corrections to minimize the sum of the squares of the angles of correction 
from the initial guess, or the guess obtained from the robot inverse kinematics. The 
initial guess may not be very good, but it is the best independent estimate available. 
Note that the magnitude of the object weight, m, can be supplied as an additional 
parameter in the minimization if it is not known. 

Once the correction transforms have been found, they can be used in Equation 5.3 
to transform the measured surface normals, which can then be used to find the object 
orientation as shown in the previous section. 


5-4-3 Refinement Using Small Angle Approximation 

One problem with using the minimization numerical technique for finding the tip ori¬ 
entation corrections is its susceptibility to local minima. As with any minimization, if 
the function being solved contains local minima, false solutions may be obtained. Given 
a good initial guess, local minima will hopefully be avoided. However, non-numerical 
techniques are certainly better. This section develops an alternative approach based 
on a small angle approximations of the rotation matrix that represents a tip correction. 
This approach has the advantage of linearizing the problem, giving a system of equations 
that can be solved using a matrix inversion. 

The zyz Euler angle representation for a rotation given by Equation 5.11 can be 
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linearized (Paul [81]) to 

1 —a 
a 1 —7 

-fi 7 1 


(5.12) 


Using this linearized form of a rotation, and from Equations 5.5 and 5.6, 
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(5.13) 


In this form, the desired correction angles are obtained by solving for the Euler 
angles. The forces, / are obtained from the sensors, and transformed to the nominal 
world coordinate system using the modeled kinematics. Rearranging Equation 5.13 into 
the form 


AX = B, (5.14) 

and eliminating the equation for the final z rotation (which cannot be solved for), gives 
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~ fix — fix — fix 
—fly ~ fly — fly 
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All force readings, /*, are assumed to be in frame tuj. The superscript s refers to the 
sample number. As before, 73 cannot be found. Equation 5.14 has eight equations and 
eight unknown, and thus A -1 is easily obtained. Additional equations can be obtained 
by using more than the required minimum of three samples. In this case, A -1 could be 
obtained by a pseudo-inverse. If the object weight was unknown, Equation 5.15 could 
be augmented with the equation from the third sample that was omitted. 

5.5 Experiments and Results 

These experiments test the ideas presented in the previous sections. As outlined above, 
the direction of gravity is used as an invariant in finding sensor frame correction trans¬ 
forms. The first experiment attempts to weigh objects. While measuring an object’s 
weight is not a particularly interesting experiment in itself, it provides an easy test of 
the refinement technique. It is much easier to verify an object’s weight than its global 
orientation. The next experiment examines the refinement of contact normals. Again, to 
avoid the issue of global workspace calibration, objects with parallel faces are grasped, 
where the fingertip normals are known to oppose each other. Before describing the 
experiments, the setup and procedure are detailed. 
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5.5.1 Experimental Setup 

The experiments described in the next section were conducted using a Salisbury [90] 
hand mounted on a Puma 500 arm. A photograph of the robot is shown in Figure 5.2. 
In this setup, contact forces and surface normals are obtained from Brock and Chiu’s [15] 
force sensing fingertip sensors (Figure 5.3). Each sensor has a six axis load-cell, built 
with eight strain gauges arranged in a Maltese cross configuration. The gauges are 
paired off with each other, one on each side of the beams that form the cross. The 
sensor surface is polished aluminum. See Appendix A for a detailed description of the 
robot and its computational architecture. 

The sensors are calibrated using a specially designed apparatus that can apply forces 
and torques in known directions. The calibration process involves probing the sensor 
and recording the strain gauge outputs. This data gives a forward calibration matrix. 
A Morse-Penrose pseudo inverse is computed to obtain a conversion matrix from sensor 
readings to force values. If C is the experimentally determined calibration data, and S 
is the 8x1 strain gauge reading vector, then 

F = C~ l S. (5.16) 

This approach, however, does not take into account the effect of the weight of the 
fingertip on its own sensor readings. As previously discussed, the orientation of the tip 
must be known, and an appropriate weight correction must be applied. 

Most of the error in estimating the position and orientation of the sensor frames comes 
from the hand. The Salisbury hand lacks both joint angle sensors on the finger joints 
and encoder absolute zero marks. Instead, motor positions are used to estimate joint 
positions. Due to compliance in the tendon system, this estimate is not very accurate. 
To further compound the problem, the hand must be manually zeroed at startup. 

Experimentally, it was found that refinement using the small angle approximation 
(Section 5.4.3) did not work well. This can be explained by the rather large correc¬ 
tion angles that were found to occur. This would cause the small angle assumption to 
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fail, giving erroneous results. Because of this, the minimization method for finding the 
corrections was used instead. 

The correction transforms were obtained using the Minpack [73] package. This pack¬ 
age of Fortran subroutines is used to solve numerical minimizations of systems of equa¬ 
tions. In particular, the LMDIF subroutine was used, which implements a modification 
of the Levenberg-Marquardt nonlinear least squares algorithm. The minimization solved 
by LMDIF is given by 

min |yi/i(#) 2 : xei? n | (5.17) 

where /,(x) is obtained from Equation 5.7, and p is the number of sensor samples. 
LMDIF computes the required Jacobian matrix using a forward-difference approxima¬ 
tion. 


5.5.2 Weighing an Object 

This section shows the results from a few representative attempts to weigh an object. 
A more general discussion of the results is then presented in Section 5.7 below. 

Figure 5.4 shows the measured force sums for all samples of one run along the world 
x, «/, and 2 -axes. These sums are shown before the correction transforms have been 
applied. The object’s actual weight, as shown in the figure, is 1.56N. Note that all the 
trials in this section used 15 sensor samples for the minimization. As will be discussed in 
the next section, this turned out to be inadequate. A larger number of samples should 
have worked much better. 

Figure 5.5 shows force sums from the same run after the correction transforms have 
been applied. The object weight was supplied as a variable (with a guess of 1.0N). Note 
that the force sums in the x and y directions are now nearly zero, as they should be, 
and the force in the 2 -direction is nearer to the actual object weight. 

Table 5.1 shows the rotation correction Euler angles (in radians) returned from three 
runs executed at different points in the workspace. The object weights returned from 
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Figure 5.4: This plot shows the sum of the x, y, and z forces, before the tip orientation 
corrections have been applied. 
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Figure 5.5: This plot shows the sum of the x, y, and z forces, after the tip orientation 
corrections have been applied. 
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Table 5.1: Tip rotation corrections. The rotation corrections in radians, for each tip, for 
three trials at different points in the workspace are shown. 


Weight 

Optimized 

Average 

1 

1.50 

1.86 

2 

2.02 

2.18 

3 

2.17 

2.58 


Table 5.2: Optimized and average object weights. The optimized and average object weights 
(in Newtons) obtained for the same three trials are shown. The actual object weight was 1.56N. 

the same three runs are shown in Table 5.2. The optimized weight is the weight returned 
by the minimization process, and the average weight is the average over all samples of 
the magnitude of the corrected force sum. These runs were also done with an initial 
guess at object weight of 1.0N. The actual object weight is 1.56N. The error terms for 
the three runs, by sample, are shown in Figure 5.6. 

Figure 5.7 shows the effect of supplying different guesses of the object weight on the 
optimized weight returned. The actual weight of the object grasped in these trials is 
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Figure 5.6: Plot of error term, by sample. This plot shows the error terms, by sample, 
returned by the minimization for the three trials shown above. 

2.2N. 


5.5.3 Improving Contact Normals 

The accuracy of the results obtained can be further verified by examining the corrected 
surface normals. The grasp used in the experiments had two fingers (tips 1 and 2) 
opposing the thumb (tip 3). The corrected surface normals should reflect this, and 
oppose each other. Table 5.3 shows the average surface normal measurements obtained 
before and after tip orientation correction. There is again a noticeable improvement in 
the results. If the angle between vectors is used as a measure, then tips 1 and 2, which 
should be parallel, show a difference of 0.23 radians before correction and 0.19 radians 
after correction. Tips 1 and 3, which should oppose each other, show a difference of 2.61 
radians before correction and 2.93 radians after, and tips 2 and 3, which also oppose 
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Figure 5.7: Variation based on initial guess. This plot demonstrates the variation in final 
weight convergence value, based on the initial weight guess. 


0 

Tip 1 

Tip 2 

Tip 3 

X 

0.54 

0.70 

-0.21 

y 

0.54 

0.51 

-0.29 

z 

0.64 

0.50 

-0.93 


9 

Tip 1 

Tip 2 

Tip 3 

X 

0.64 

0.73 

-0.47 

y 

0.28 

0.34 

-0.36 

z 

0.71 

0.59 

-0.81 


Table 5.3: Tip normal directions before and after corrections. The average normal directions 
for the first of the trials is shown both before (left) and after (right) the rotation corrections 
have been applied. 


each other, show a difference of 2.43 radians before correction and 2.80 radians after. 
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5.6 Simulations 

Simulations were conducted to gain some insight into this somewhat disappointing ex¬ 
perimental performance. A model that predicts the forces felt by the fingertip sensors is 
used to generate synthetic data. By using this model, a more rigorous investigation of 
the refinement process is possible. Various controlled simulations will examine the con¬ 
vergence properties of the numerical methods under differing conditions. The accuracy 
of the small angle linearization method will also be examined. 

An important question that has not yet been answered is whether pose refinement is 
possible at all. Both the numerical minimization method and the small angle lineariza¬ 
tion method make the assumption that by reorienting a grasped object an independent 
set of sensor readings can be obtained. Some lurking dependencies in the measurement 
might make it impossible to solve for the orientation corrections. Through intuition, it 
has already been mentioned that a world rotation around the z axis cannot be recovered. 
Perhaps there are other parameters of the correction that also cannot be found? An ar¬ 
gument can be made that shows by reorienting the hand it is possible to find the point 
where each fingertip is aligned with the gravity vector. This gives hope that fingertip 
corrections can be found by using more general motions of the hand. The simulations 
in this section will show that, aside from the world z rotation, all fingertip orientation 
parameters are deducible from multiple readings of the sensor data. 


5.6.1 Simulator Design 

Before discussing the simulation experiments, a model for predicting fingertip grasp¬ 
ing forces is presented. This model is used to obtain the synthetic data used in the 
simulations. Refer to Figure 5.8 for the notation used in the subsequent derivation. 

The grasped object is assumed to be in static equilibrium, with gravity as the only 
external force. Without loss of generality, the coordinate system is assumed to be at the 
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Figure 5.8: Grasping force model. The grasping forces, fi, can be found using torque and 
force balance equations, along with a set of internal forces c t . 


object’s center of mass. The force and torque balance equations are written as 


mg = 


(5.18) 


«=1 

3 


o = 5> x /- (5.19) 

*=i 

where mg is the object’s weight, r, is the vector to the fingertip contact point, and /, 
is the fingertip contact force. For simulation purposes, the object weight and contact 
points are specified. The contact forces are desired. Since Equations 5.18 and 5.19 
can each be written as x , y, and 2 component equations, there are six equations and 
nine unknowns. The three additional parameters that must be specified are commonly 
referred to as the internal forces, and can be defined as follows: 


ci = r 12 • (/i - / 2 ) 

c 2 = r 23 • {fi — J 3 ) (5.20) 

c 3 = ^31 ' {fz ~ fl)- 

Note that these internal force equations are arbitrary. Any three independent equations 
that relate the fingertip forces can be used. These particular equations capture the 
notion that the internal forces are the amount of squeezing force between each of the 
fingers. 
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Equations 5.18, 5.19, and 5.20 can be written in matrix form as, 
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. (5.21) 


This set of linear equations can be easily solved to find the fingertip forces. 

To obtain a set of simulated sensor readings as an object is rotated, the object weight 
mg, the internal forces c,, and the contact vectors r,, are specified. To rotate the object, 
the ri vectors are subjected to a rigid rotation. The new r t vectors are then used to 
compute the next set of fingertip force readings. This process is repeated to obtain the 
desired number of force samples. This simulation gives forces that would correspond to 
those generated by motion of a robotic arm with the hand’s finger positions fixed. 

This scheme assumes that the internal forces c, are constant throughout the rotation. 
In reality, these forces are a function of the control law used to servo the fingers. It can 
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y 


X 

Figure 5.9: Spherical coordinate system. By varying <f> and 0, a rotational error of magnitude 
p can be generated in a particular direction. 

be shown that the constant force assumption holds if a proportional controller is used 
to drive each fingertip to a set point somewhere inside the object. This corresponds to 
how the hand used in the experiments was controlled. 

In the final step for obtaining simulated data for the refinement process, the synthe¬ 
sized force readings for each fingertip are rotated by a fixed amount. This rotation is 
the error that the refinement process must recover. For some of the simulations, ran¬ 
dom noise and bias offsets are also added to the readings. This helps investigate the 
robustness of the numerical method. 

5.6.2 Orientation Error Direction 

To determine if the orientation of the fingertip error had an effect on the performance 
of the correction method, the following experiment was performed. A sampling of all 
possible error orientations was applied to a set of synthetic data. For each error orienta¬ 
tion, a correction was computed using the minimization algorithm. The error between 
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the actual correction and the computed correction was then computed. Error is defined 
to be the magnitude of the difference in 9 values of the equivalent axis rotations for the 
actual and computed orientations. The space of possible error directions is computed 
using a two parameter spherical coordinate system, as shown in Figure 5.9. If the mag¬ 
nitude of the error is p , the parameters <j> and 9 can be sampled to obtain a set of error 
vectors, 


x — p sin <f> cos 9 

y — psm<f>s'm9 (5.22) 

z = p cos (j> 


where [xyz] T are the a, ft, 7 error angles. 

Using the spherical system defined in Figure 5.9 and Equation 5.22, the convergence 
characteristics of a fixed single tip error directed in all possible orientations was exam¬ 
ined. The plot in Figure 5.10 shows the results of this simulation. The x and y axes 
represent values of <f> and 9 from 0 to 27r. The 0 axis represents the magnitude of the 
ratio between the actual and recovered orientation corrections. Note that in almost all 
directions, the method significantly improves the normal direction measurement. While 
in this example certain directions did not show as impressive an improvement as other 
directions, this was more a function of the particular samples, rather then a systematic 
failure of the method in a particular direction. 

The previous results were obtained from perfect data. In reality, at least two types of 
problems with the sensor data can occur. The readings are subject to random noise and 
to systematic calibration errors. A number of simulation experiments were performed to 
investigate how well the method performs under these more realistic assumptions, and 
are described in the next several section. 
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Figure 5.10: Convergence error versus applied error direction. The magnitude of the plot 
indicates the relative performance of the algorithm for the given phi and 8. The lower the 
value, the better the algorithm performed. 


5.6.3 Sensor Noise 

Table 5.4 presents the eight actual and recovered fingertip correction angles for simulated 
sensor data with varying noise levels. The angles, listed in order, are the zyx corrections 
for each finger. The first column shows the actual error applied to the eight recoverable 
fingertip orientation angles. Each subsequent column shows the correction recovered, 
given the indicated noise level. Note that the method is able to recover the correction 
accurately, even with levels of random noise exceeding 25 percent. To a certain extent 
this is to be expected, since on average the noise cancels itself out. Nonetheless, it is 
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0.3 

0.27 

0.27 
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0.28 

-0.2 

-0.24 
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Table 5.4: Effect of noise on recovered angles. This table shows the recovered eight correction 
angles, with varying amounts of noise added to the simulated data. The rows of the table show 
the correction angles for each of the recovered corrections. 
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Table 5.5: Effect of noise on object weight. This table shows the recovered object weight, 
with varying amounts of noise added to the simulated data. The first row shows the optimized 
weight, the second shows the average weight. 


promising that the numerical minimization is able to converge to the correct solution 
even when substantial sensor noise is present. 

Table 5.5 presented the actual and recovered object weight for simulated sensor 
data with varying noise levels. For these trials, the object weight was assumed to be 
unknown, and was included in the minimization error term. An initial guess of zero was 
used. In the table, the term optimized weight refers to the actual weight recovered by 
the minimization error term. The average weight refers to an average over all samples 
of the corrected weights. The optimized weight in general provides a more accurate 
solution. Notice that for even a large 45 percent noise, the weight is reliably recovered. 
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Figure 5.11: Recovered object weight by sample, without noise. The raw and corrected object 
weights are plotted, by sample, in this diagram. The best fit line for each set of samples is also 
shown. 

Figure 5.11 and Figure 5.12 show plots of object weight (sum of the z forces) for each 
sample from a particular trial. For these experiments, 50 samples were used. The object 
weight was 100N. The raw z force is plotted in dots, the corrected z force is plotted 
in stars. For each plot, the best fit line to the points is also shown. The plot shown 
in Figure 5.11 shows the results for simulated data without noise. The plots shown 
in Figure 5.12 shows the results for simulated data with 10, 20, and 30 percent added 
random noise. Notice that in all cases the method can accurately recover the correct 
weight. 


5.6.4 Orientation Error Magnitude 

Simulations were performed to investigate how the magnitude of the fingertip orientation 
error effects the performance of the correction method. For error magnitudes ranging 
from 0.1 to 1.4 radians, 625 trials were performed, where the error was varied across 
all possible directions. In addition, random noise ranging from 10 to 50 percent of the 
signal was added. Figure 5.13 plots the results of this experiment. In the plots, the 




object weight object weight object weight 
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Figure 5.12: Recovered object weight by sample, with noise. The raw and corrected object 
weights are plotted, by sample, in this diagram. The best fit line for each set of samples is also 
shown. From top to bottom, noise of 10, 20, and 30 percent has been introduced into the data. 
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error magnitude 


Figure 5.13: Percent convergence with noise. A constant error magnitude is sampled in all 
directions. For these samples, the percent that show a significant correction is computed. The 
percentage is plotted against the error magnitude. The upper plot uses a correctness threshold 
of 0.1 radians, the lower plot uses 0.2 radians. 
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percent of samples corrected are the percent of trials that are corrected to with 0.1 and 
0.2 radians of the actual normal direction. 

5.6.5 Sensor Calibration Bias 

Simulations were performed to investigate how sensor biases affect the performance of 
the correction method. Biases of 5, 10, and 15 percent was added to the sensor readings. 
In addition, random noise of 10 and 20 percent was added. Figure 5.14 plots the results 
of the experiment with 10 percent noise, while Figure 5.15 plots the results with 20 
percent noise. Notice that even small biases dramatically reduce the performance of the 
method. As will be discussed later, this problem was partly responsible for degrading 
the experimental performance that was observed. 

5.6.6 Small Angle Approximation 

Figure 5.16 examines the sensitivity of the linearized refinement algorithm to the small- 
angle approximation that it uses. The figure plots the magnitude of the correction error 
against the percent deviation of the correction. Percent deviation is defined to be the 
ratio of the correction error to the applied error. Each data point is computed from a 
number of trials each at the given error magnitude, in different directions. An acceptable 
correction deviation is obtained for errors under 1 x 10 -2 radians. Above that, the small 
angle approximation used to linearize the problem fails, and the corrections obtained 
are no longer accurate. 


5.7 Summary and Discussion 

This chapter presented a method for continuously calibrating the orientations of the 
fingertips of a hand to obtain more accurate measurements of contact normals and forces. 
The methods are needed because calibration errors in the robot’s kinematics significantly 
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Figure 5.14: Percent convergence with 20 percent noise and bias. A constant error magnitude 
is sampled in all directions, with a fixed sensor bias added to the simulated readings. For these 
samples, the percent that show a significant correction is computed. The percentage is plotted 
against the error magnitude. The upper plot uses a correctness threshold of 0.1 radians, the 
lower plot uses 0.2 radians. 
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Figure 5.15: Percent convergence with 30 percent noise and bias. A constant error magnitude 
is sampled in all directions, with a fixed sensor bias added to the simulated readings. For these 
samples, the percent that show a significant correction is computed. The percentage is plotted 
against the error magnitude. The upper plot uses a correctness threshold of 0.1 radians, the 
lower plot uses 0.2 radians. 
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Figure 5.16: Percent deviation of linearized correction. The x-axis denotes the applied 
error magnitude, plotted in a logarithmic scale. The y-axis shows the percent deviation of the 
correction for the given error. 

reduce the utility of the fingertip sensors. Without good global knowledge of forces and 
normals, certain recognition and manipulation tasks are very hard to perform. 

The calibration method discussed uses multiple sensor readings obtained while the 
robotic arm is moving a hand holding a grasped object. The fingertip to world coordinate 
frame transform is refined according to the constraint that the sum of the fingertip forces 
must always equal the object’s weight. 

The simulations and experiments conducted gave mixed results. In general, they 
indicated that the method works well for certain types of errors. For large random noise 
the method was able to recover the correct fingertip orientations. On the other hand, 
for relatively small sensor bias errors, the performance deteriorated rapidly. 

The experimental results were not particularly good. While some of the trials pro¬ 
duced good results, with an accuracy close to 5 percent, others trials had much worse 
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performance. Using the results of the simulations, three potential causes for the rela¬ 
tively poor experimental performance were identified. The next sections discuss them, 
and propose potential solutions. 

5.7.1 Poor Sensor Calibration 

The simulations indicated that while the method is relatively immune to noise, it is 
susceptible to calibration and other bias problems. Upon further investigation, the actual 
calibration of the fingertip sensors proved to be rather poor. The calibration process 
requires mounting the sensor on a stand, and applying known forces at particular points 
on the device. Unfortunately, the apparatus for applying these forces was rather crude. 
At first it was hoped that a calibration accuracy of 5 percent would be possible. In 
actually, the accuracy was probably no better than 15 percent. 

5.7.2 Inadequate Sample Size 

The simulations confirmed that the method’s susceptiblity to noise is greatly reduced as 
the number of samples is increased. For the noise levels introduced in the simulations, 50 
samples proved to be necessary. Due to limitations in the experimental procedure, only 
15 samples were used, apparently too few to mitigate the effects of the noise levels that 
were present. This alone would greatly degrade the convergence characteristics even at 
moderate noise levels. 


5.7.3 Incorrect Assumptions 

Finally, poor experimental performance could be a result of incorrect assumptions. The 
most critical assumption that was made is that all the error in sensor orientation was a 
result of calibration errors in the hand. Motion of the arm was assumed to be accurate. 
While all indications are that this assumption was valid, further investigation on this 
point is warranted. 
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Discussion 


This report examined the problem of finding the pose of an object grasped by a hand. 
The methods that were studied can successfully determine an object’s pose using a 
minimum of information. The use of kinesthetic sensing, along with knowledge of the 
grasp acquisition strategy employed by the hand, usually provided sufficient data for 
the task. In the case of the pose determination problems conducted with the Utah-MIT 
hand, just 16 numbers were used as input to the recognizer. The power of these methods 
can be attributed to their careful exploitation of the constraints inherent in the problem. 

Pose determination was argued to be an important problem for several reasons. For 
almost all manipulations, at least a certain amount of information on the location of 
objects both relative to the robot and relative to the world is required. Performing 
an accurate calibration is not enough, as error and uncertainty is unavoidable. Even 
strategies that are guaranteed to work usually have a bound on the error in object 
position that they can tolerate. The reality of robotics is that uncertainty is unavoidable. 
Pose determination provides a way to compensate for this uncertainly. 

Some typical problems that are suitable for the pose determination methods studied 
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in this report include parts alignment, grasp verification, and haptic exploration: 

• Parts alignment: A simplified multi-linked gripper operating on an assembly line 
could grasp a part, verify its pose, and then direct another robot to the part in 
the now known location. Thus, the gripper serves both as a position sensor and a 
clamp, much the same role as a conventional parts feeder. An advantage with this 
approach is that the system is not particular to the part. Retooling for new parts 
is simplified. 

• Grasp verification: When a dextrous hand has completed a grasp, position uncer¬ 
tainty in the system usually causes in a wide variation in the object’s final pose. 
Pose determination could be used to find the position, and then used to plan sub¬ 
sequent manipulations around the true object pose, rather than the inaccurate 
planned pose. 

• Haptic exploration: Hands, both human and robotic, can be thought of as sensory 
organs. It is possible for a hand to provide the sensory information necessary for 
a wide variety of recognition tasks. These exploratory motions are used for tasks 
such as reaching into a bag and pulling out a particular object and groping in the 
dark for a telephone receiver. The pose determination methods studied provide 
certain insight to the information sources that human haptics must utilize. 

The methods presented in this report show that hand shape and knowledge of the grasp 
acquisition strategy can provide useful information for solving these types of problems. 

6.1 Review of the Report 

The first algorithm, presented in Chapter 3, studied a method to find object poses 
that were consistent with a hand shape. Assignments of object vertices to finger edge 
segments were examined in a systematic manner, using an interpretation tree to guide 
the search. By carefully pruning inconsistent vertex-edge pairs, the portion of the tree 
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that was examined was minimized. This method found all potential placements of the 
object that were consistent with the hand shape, given certain contact assumptions. 
Generated poses were then verified according to several criterion. Most importantly, the 
pose and hand shape were checked to insure compatibility with the grasp acquisition 
strategy that was used. 

Next, Chapter 4 showed how a memory could be used to speed pose determination. 
By storing past experience in the memory, the on-line determination process can be 
reduced to a simple lookup operation. To improve accuracy, regression analysis was 
used on similar poses, providing a way to interpolation between memory entries. The 
use of regression also allowed the memory to be compacted. The interpolation methods 
were general, and did not depend on solutions that are particular to specific objects or 
hands. 

Finally, Chapter 5 examined how additional information provided by contact sensors 
could be used to refine a pose estimate. The pose of an object model can be fit to 
measured fingertip surface normals. The fitting process is only as accurate as the surface 
normals are themselves. Since sensors measure contact in a local coordinate frame, and 
since the fitting requires global data, accurate kinematic models and kinesthetic data is 
also required. In practice, calibration errors were found to greatly degrade the potential 
performance of this type of method. The chapter explored a refinement process that 
attempted to correct for calibration errors using world invariants. 

6.2 Hand Design for Haptics 

It is interesting to explore how the findings of this report could be of use to hand 
designers. This section briefly addresses this problem, in particular by exploring the 
notion that a hand should be designed as much for recognition as for manipulation. 

The role of hand shape for recognition, as used in this report , is twofold. It is used 
for both pose generation and for pose verification. 
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Pose generation relies on the existence of a minimum number of contacts between the 
object and the hand. By searching the space of possible contact assignments, potential 
object poses are found. Because a certain minimum number of contacts are necessary, 
the hand must have enough fingers and links to usually achieve the necessary number 
of contacts in a grasp. 

Pose verification relies heavily on the shape of the hand. The enclosure formed by the 
hand around the grasped object is tested for intersection against the postulated object 
position. A hand that forms a better enclosure will usually permit fewer collision free 
object placements. 

Both the generator and verifier stages of the algorithm benefit from hands with 
more fingers and links. However, additional links increase the contact assignment search 
space. The simulations in Section 3.7 suggest that for three-dimensional problems the 
combinatorics are rather large without some contact information. Those results indicate 
that the search time is greatly reduced simply by sensing if contact has been made with 
a particular link, without knowledge of the contact location. Further search reductions 
can be obtained either by reducing the link lengths, or by using more precise contact 
sensing. 

Experiments performed in Section 4.5.2 suggest that the the mapping from hand 
shape to object pose benefits from the addition of simple contact sensor information. 
Without any contact sensing, hand shapes frequently map to more than one object pose. 
For the cases where a unique mapping does not exist, unambiguous pose determination 
cannot be performed. If the hand links that are in contact with an object are known, 
the results from that section suggest that the mapping ambiguity is greatly reduced. 
The additional sensing power obtained by adding this type of sensing is larger than that 
obtained by adding a small number of additional links to each finger. This suggests to 
designers that adding minimal contact sensing, rather than just more finger segments, 
may be the best way to improve the recognition power of a hand. 



§ 6.3 Future Work 


161 


While it is clear that additional sensor information will always be helpful for pose 
determination, these results indicate that very little additional sensor information can 
be put to good use. In particular, the type of sensing information that has been shown 
to be useful is easy to obtain. Devices that can determine if contact with a link has been 
made can be reliably fabricated using existing sensor technology (see Section 2.3). 

Using the methods developed in this report , pose determination without these basic 
sensors suffers from both a search combinatoric explosion, and from ambiguity in the 
results. Adding minimal amounts of additional sensors is likely to correct this. 

6.3 Future Work 

The work presented in this report provides a promising approach for solving the pose 
determination problem. Nonetheless, much work remains. A few of the more interesting 
problems that deserve attention are listed here: 

• Three dimensions: The pose determination method presented in Chapter 3 was 
implemented in two dimensions. By examining the first stages of a full three- 
dimensional implementation, it was argued that this extension could be performed. 
The extension should be completed, and experiments conducted on data from 
dextrous hands. 

• Sensitivity: A better understanding of the sensitivity to sensor noise is warranted. 
In particular, for the methods in both Chapters 3 and 4, bounds on the performance 
based on joint angle sensor performance should be developed. If the sensors are 
too noisy, potential poses would certainly be missed. 

• Sensor fusion: A better investigation of how tactile contact sensors can be used 
with these methods would be worthwhile. For example, binary contact informa¬ 
tion, which would be easy to obtain, could be used to guide the tree search used in 
Chapter 3. The additional sensor information could be used to make the methods 
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Appendix A 


This appendix provides a detailed description of the hardware and software used by 
the experiments described in this report . Special attention is given to the compo¬ 
nents which the author helped design and implement. Section A.l describes the setup 
for the constraint-based localization experiments from Chapter 3, Section A.2 describes 
the setup used for the memory-based recognition experiments from Chapter 4, and Sec¬ 
tion A.3 describes the setup for the sensor-based refinement experiments from Chapter 5. 

A.l The Utah-MIT Hand 

A. 1.1 Mechanical Design 

The Utah-MIT hand [57] (Figure 2.2) was used for the constraint-based pose localization 
experiments that were described in Chapter 3. This hand has four fingers, each with 
four degrees of freedom. An anthropomorphic design was used, giving it much the 
same size and shape as a human hand. Each joint is connected to two tendons, one for 
extension and one for flexsion, giving the hand a total of 32 actuators. Specially designed 
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Condor 

System 


Figure A.l: Block diagram of the Utah-MIT hand system, as used for the constraint based 
recognition experiments. 

pneumatic actuators are used for power, and are housed in an external actuator pack. 
The 32 tendons are routed from the actuator pack to the hand using a remotizer. This 
arrangement permits off loading the weight of the actuators from the hand itself, making 
mounting on existing robots easier. 

The hand is mounted on a cartesian robot that provides three degree of freedom 
xyz motions. The hand itself is mounted on a gantry which provided xz motion. The 
remaining motions are provided by an xy positioning table. This setup gives a redundant 
motion in the x axis. The cartesian robot is actuated using stepper motors. 

A.1.2 Control Architecture 

The Condor [75] system is used to control the hand and cartesian robot. Condor is a 
real-time software environment designed for multiprocessor-based robotic control. The 
system provides interprocessor communication primitives, an efficient scheduler, and 
host computer support. A Sun workstation front-end, linked to the VMEbus multipro¬ 
cessor backplane using a memory mapped connection, provides development support. 
The development tools include a symbolic debugger, a virtual terminal system based on 
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Figure A.2: Block diagram of the MIT planar hand system, as used for the memory based 
recognition experiments. 

the X Window System, and a fileserver. Six Motorola 68020 processors are used in the 
system. The first processor is the system supervisor. The next four run the low level 
servo code, one processor for each finger. An additional processor is used to control the 
cartesian robot. The system achieves finger joint servo rates of up to 300 hertz. 

A remote procedure call (RPC) interface to the hand-arm control system is also 
provided. The localization system is coded in Lisp and runs on a Symbolic Lisp Machine. 
The Lisp Machine communicates with the Condor using the RPC interface. Some of 
the operations that are supported by the RPC interface include commands to control 
the servo system, to enqueue trajectories, and to query joint positions and torques. 
Figure A.l diagrams the system. 


A.2 The MIT Planar Hand 

A.2.1 Mechanical Design 

This hand has two fingers, each with two joints. A belt pulley system connects the servo 
motors to the joints. Position sensing is provided by shaft encoders mounted on each 
motor. The finger surfaces are flat, providing a good surfaces for whole hand grasping. 
The hand is designed to allow reconfiguration of the separation distance between the 
fingers. 
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A.2.2 Control Architecture 

The hand is interfaced to an IBM PC computer running PC/NFS, and uses a controller 
board based on the Hewlett-Packard HCTL-1000 servo processor. This chip provides low 
level servo control, along with the interface logic necessary to process the optical position 
encoder output from each joint. A control system was implemented that executed the 
grasping strategies used for building recognition memories. After executing a grasp, the 
resulting joint positions (the only sensory information required for recognition by the 
memory-based method) is written to a file on an NFS mounted disk. The recognition 
software is written in Lisp and runs on a Symbolics Lisp Machine. The Lisp Machine 
reads the joint data from the NFS mounted filesystem and finds consistent poses in 
real-time. Figure A.2 diagrams the system. 


A.3 The Salisbury Hand 

A.3.1 Mechanical Design 

A Salisbury [90] hand, mounted to a Puma 500 robotic arm, was used for the refinement 
experiments described in Chapter 5. The hand has three fingers, each with three joints, 
as diagrammed in Figure 2.1. The actuator system uses n + 1 motors for n joints. Thus, 
there are four motors for each finger, or twelve for the entire hand. Tendons connect the 
actuators to the joints. 

The hand is equipped with Brock and Chiu [15] force sensing fingertips. Each of the 
fingertips has eight strain gauge sensors that are used to detected the contact forces and 
torques. The strain gauges are connected to analog amplifiers which are interfaced to 
the controller computer. 
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Figure A.3: Block diagram of the Salisbury hand system, as used for the pose refinement 
experiments. 

A.3.2 Control Architecture 

The control architecture for the Puma-Hand system is based on the VxWorks [108] 
real-time operating system running on Motorola 68030 processors. VxWorks provides 
a low-level kernel with a fast scheduler, networking tools, and a standard Unix-style 
library interface. The operating system provides NFS filesystem access to a host Sun 
workstation that is used for program development. A VMEbus backplane is used to 
interconnect three Motorola 68030 processors that each run VxWorks. Figure A.l dia¬ 
grams the system. 

The Salisbury-JPL hand is interfaced to two Unimation servo controllers, which pro¬ 
vide low level position control. The servo controllers are interfaced, via a parallel port, 
to the VxWorks system. A software module on one 68030 processor feeds position com¬ 
mands to the Unimation controllers. A higher level message-based trajectory controller, 
for enqueuing a sequence of position setpoints, provides the external software interface 
to the system. 

The refinement code is written in C and runs on the VxWorks real-time system. The 
code communicates with the Puma, the hand, and the fingertip sensors. The hand is 
first commanded to close on an object until contact is detected by the fingertip sensors. 
The Puma then sweeps out a motion while the sensors are sampled. The refinement 
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The second approach is based on storing past recognition experiences in a memory. 
Recognition becomes a fast lookup operation. The number of possible grasps stored in 
the memory can be kept small by exploiting a grasp acquisition strategy constraint. The 
memory can be further compacted by using interpolation schemes. Various approaches 
for organizing, filling, and using the recognition memory are investigated. 

The third approach explores how additional contact sensing information can be used 
for recognition. Fingertip force sensors are used to measure contact surface normals. 
An object’s pose can be refined by fitting these normals to a model of the object. Since 
contact sensors produce local readings, and since the fitting process requires global 
information, calibration becomes an important issue. The method explored provides a 
way to self calibrate for sensor orientation errors. 

An important theme throughout the report is that knowledge of the strategy the 
hand uses to acquire an object provides a very important recognition constraint. While 
a dexterous hand has a very large configuration space, the number of unique grasps that 
correspond to a given strategy are limited. Knowledge of the grasp acquisition strategy 
makes pose determination easier. 
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